---
title: Maestro
---
Maestro is an OMR (Optical Music Recognition) program. You can think of it as
OCR for sheet music. It was a school project I built in 2005 in a team of four.

On the team, I was responsible for line detection, stanza deduction, audio
playback and the user interface.

WHY AN OMR?
-----------
During the second year of undergraduate study, we had to build an
algorithm-oriented project. It didn't matter whether it had nice graphics; it
had to be powered by interesting algorithms.

A lot of people just build an OCR, which is basically text extraction from a
scanned sheet. This is a perfect example of what I just said: all the focus is
on an efficient algorithm. But we wanted to innovate!

In the end, we chose to create an application able to recognize sheet music and
play the melody. We named our OMR Maestro.

HOW DOES IT WORK?
-----------------
Here are the basic steps:

1. Clean the image
2. Rotate the image
3. Lines detection
4. Stanzas deduction
5. Cut each stanza into musicscores
6. Musicscores recognition
7. Chains deduction
8. MIDI transcription
9. Audio playback

First of all, we must clean the image. As a result, we obtain a binary image
(each pixel is black or white). Then we rotate the image so that the lines are
horizontal.

After that, we detect the lines and deduce the stanzas. We cut each stanza and
send the pieces to the musicscore recognition module. Once the musicscores are
recognized, we look for chains like flats, fast times, etc.

Finally, we convert the result into a MIDI file and play it. The MIDI file
format is easy to generate from scratch and still gives good-quality output.
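
To give an idea of what "from scratch" means here, the following is a minimal Python sketch, not Maestro's actual code, that builds a single-track (format 0) MIDI file in memory. The names `vlq` and `build_midi`, the fixed velocity of 64, channel 0 and the 480 ticks per beat are all assumptions made for the illustration. The resulting bytes could be written to a `.mid` file opened in binary mode.

```python
import struct


def vlq(value):
    """Encode a non-negative integer as a MIDI variable-length quantity."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)  # continuation bit on all but the last byte
        value >>= 7
    return bytes(reversed(out))


def build_midi(notes, ticks_per_beat=480):
    """Build a format-0 MIDI file from (pitch, duration_in_ticks) pairs.

    Notes are played one after another on channel 0 with a fixed velocity
    (both are assumptions of this sketch).
    """
    track = bytearray()
    for pitch, duration in notes:
        track += vlq(0) + bytes([0x90, pitch, 64])         # note on, no delay
        track += vlq(duration) + bytes([0x80, pitch, 64])  # note off after `duration`
    track += vlq(0) + bytes([0xFF, 0x2F, 0x00])            # end-of-track meta event

    # Header chunk: length 6, format 0, one track, time division.
    header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, ticks_per_beat)
    return header + b"MTrk" + struct.pack(">I", len(track)) + bytes(track)


# C, D, E as quarter notes, then G as a half note.
melody = [(60, 480), (62, 480), (64, 480), (67, 960)]
data = build_midi(melody)
```
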

LINES DETECTION
---------------
For line detection, I created a very simple algorithm. I start from the top
middle of the image and go down. When I find a black pixel, I walk left and
right from it until I hit a white pixel. Counting the black pixels gives me the
length of the line.

At first, I thought this would be the definition of a line: if the length is
greater than half the page width, it is a line. But it didn't work, for two
reasons: imperfections in the image can leave a few white pixels inside a line,
and some lines that are not part of a stanza can show up. I had to redefine what
a line is.

I introduced the concept of imperfection. We now have a variable initialized to
5. Every time we find a white pixel, this variable is decremented, and when it
reaches 0 we stop the traversal. But if we find a black pixel, we increment the
variable (up to 5).
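
The traversal described above can be sketched in Python as follows. This is an illustration under assumptions, not the original Maestro code: the image is taken to be a list of rows of 0 (white) and 1 (black), the function names are made up, and treating consecutive qualifying rows as one thick line is my own addition. The budget of 5 and the "longer than half the page width" rule come from the text.

```python
MAX_BUDGET = 5  # imperfection budget from the text


def walk(row, start, step, max_budget=MAX_BUDGET):
    """Count black pixels from `start` in direction `step` (+1 or -1).

    White pixels spend the budget, black pixels refill it (capped); the walk
    stops when the budget reaches 0 or the row ends.
    """
    budget = max_budget
    count = 0
    x = start
    while 0 <= x < len(row) and budget > 0:
        if row[x]:  # black pixel
            count += 1
            budget = min(budget + 1, max_budget)
        else:  # white pixel: tolerated until the budget runs out
            budget -= 1
        x += step
    return count


def line_length(row, middle):
    """Black pixels reachable to the left and right of the middle column."""
    if not row[middle]:
        return 0
    # The middle pixel is counted once, by the rightward walk.
    return walk(row, middle, +1) + walk(row, middle - 1, -1)


def detect_lines(image):
    """Return the y positions of rows that look like lines.

    Consecutive qualifying rows are reported once, as a single thick line
    (an assumption of this sketch).
    """
    width = len(image[0])
    middle = width // 2
    lines = []
    previous_was_line = False
    for y, row in enumerate(image):
        is_line = line_length(row, middle) > width / 2
        if is_line and not previous_was_line:
            lines.append(y)
        previous_was_line = is_line
    return lines
```
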

STANZAS DEDUCTION
-----------------
Once we had all the lines of the sheet, we had to group them into stanzas. My
first approach was to group them five by five. But the problem was that if we
missed a single line, all the following stanzas were wrong.

So I calculated the average distance between two lines, which let me predict
during the grouping whether we had left the current stanza or not. In the end,
we were able to detect the stanzas correctly even if we missed some lines.
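
Here is a rough Python sketch of that idea, under assumptions of my own: the lines are given as sorted y positions, and a gap counts as "leaving the stanza" when it exceeds the average gap times a hypothetical `factor` of 2. The exact rule Maestro used is not documented here, so treat the threshold as illustrative only. With this rule, a stanza that lost one line still stays in one group, whereas grouping five by five would shift every stanza after it.

```python
def group_into_stanzas(line_ys, factor=2.0):
    """Split sorted line positions into stanzas at unusually large gaps.

    `factor` is a hypothetical tuning knob: a gap larger than
    `factor * average_gap` is taken to mean we have left the current stanza.
    """
    if len(line_ys) < 2:
        return [list(line_ys)] if line_ys else []

    gaps = [b - a for a, b in zip(line_ys, line_ys[1:])]
    average_gap = sum(gaps) / len(gaps)

    stanzas = [[line_ys[0]]]
    for y, gap in zip(line_ys[1:], gaps):
        if gap > factor * average_gap:
            stanzas.append([y])  # gap too large: start a new stanza
        else:
            stanzas[-1].append(y)  # still inside the current stanza
    return stanzas


# Three stanzas; the middle one is missing its third line (y = 130).
ys = [10, 20, 30, 40, 50, 110, 120, 140, 150, 210, 220, 230, 240, 250]
stanzas = group_into_stanzas(ys)
```
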

USER INTERFACE
--------------
This was the first time I had to use Qt, and I was not sure whether I should use
Qt Designer or do everything in Emacs. In the end, I coded the interface by hand
in Emacs, to be sure I fully understood how Qt works.

The user interface was pretty simple, since it was just a school project: each
step was available at the top of the application. You just had to click through
them to run the whole processing.

SOUND MODULE
------------
At first I tried to use TiMidity++ to play the generated MIDI file, but the
configuration was way too hard, and we wanted our software to be simple to use.
After a while, I decided to try SDL with SDL_mixer, and it was so simple that we
had a functional MIDI player in one night.

MAC OS X PORT
-------------
With the Qt framework, it is quite easy to create a multiplatform application.
One advantage of the Mac version was that Qt could play a MIDI file directly,
without SDL, because it called QuickTime. We provided a package file to install
Maestro under Mac OS X. It shipped the Qt libraries as well, so Mac users didn't
have to install Qt to use Maestro (unlike Linux users).

LAST WORDS
----------
Our software doesn't recognize every musicscore and can still be improved, but
we are proud of it. Its flaws are only due to the lack of time at the end of the
year, because our recognition engine is flexible enough to be enhanced.

We couldn't implement everything in time, even with our coding nights before the
final presentation. Who knows, it could have become a reference, since there are
not many similar applications. Thanks to Maestro, we took second place among all
the projects of our year.

TECHNOLOGIES
------------
[Qt](https://qt-project.org/): Cross-platform application framework from Trolltech