---
title: Maestro
---

Maestro is an OMR (Optical Music Recognition) program. You can think of it as
OCR for sheet music. It was my school project in 2005, built by a team of four.

Within the team, I was responsible for line detection, stanza deduction, audio
playback and the user interface.

WHY AN OMR?
-----------

During the second year of undergraduate study, we had to build an
algorithm-oriented project. It didn't matter whether it had nice graphics; it
had to be powered by interesting algorithms.

A lot of people just build an OCR, which is basically text extraction from a
scanned sheet. This is a perfect example of what I just said: all the focus is
on an efficient algorithm. But we wanted to innovate!

In the end, we chose to create an application able to recognize sheet music and
play the melody. We named our OMR Maestro.


HOW DOES IT WORK?
-----------------

Here are the basic steps:

1. Clean the image
2. Rotate the image
3. Line detection
4. Stanza deduction
5. Cut each stanza into musicscores
6. Musicscore recognition
7. Chain deduction
8. MIDI transcription
9. Audio playback

First of all, we clean the image. As a result, we obtain a binary image (each
pixel is either black or white). Then we rotate the image so that the lines are
straight.
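
To make the notion of a binary image concrete, here is a minimal Python sketch.
It is not Maestro's actual cleaning code: the `binarize` name, the grayscale
input (rows of 0-255 values) and the fixed threshold of 128 are all assumptions
made for illustration.

```python
def binarize(gray, threshold=128):
    """Turn a grayscale image (rows of 0-255 ints) into a binary image.

    True means black (ink) and False means white (paper).
    The fixed global threshold is an assumption, not Maestro's real cleaning.
    """
    return [[pixel < threshold for pixel in row] for row in gray]
```

A real scan would probably need noise removal as well; this only shows the
black-or-white representation the later steps work on.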

After that, we detect the lines and deduce the stanzas. We cut up each stanza
and send the pieces to the musicscore recognition module. Once the musicscores
are recognized, we look for chains such as flats (bemols), fast times, etc.

Finally, we convert the result into a MIDI file and play it. The MIDI file
format is easy to generate from scratch and gives good-quality output.
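
How little is needed to write a MIDI file from scratch can be shown with a
short Python sketch. This is not Maestro's transcription module: `vlq`,
`build_midi`, the fixed velocity of 64 and the 480 ticks per quarter note are
illustrative choices. It produces a format-0 file with a single track on
channel 0.

```python
import struct


def vlq(value):
    """Encode a non-negative integer as a MIDI variable-length quantity."""
    chunks = [value & 0x7F]                 # last byte: high bit clear
    value >>= 7
    while value:
        chunks.append((value & 0x7F) | 0x80)  # earlier bytes: high bit set
        value >>= 7
    return bytes(reversed(chunks))


def build_midi(notes, ticks_per_quarter=480):
    """Build a format-0 MIDI file from (pitch, duration_in_ticks) pairs."""
    track = bytearray()
    for pitch, duration in notes:
        track += vlq(0) + bytes([0x90, pitch, 64])         # note on, channel 0
        track += vlq(duration) + bytes([0x80, pitch, 64])  # note off
    track += vlq(0) + bytes([0xFF, 0x2F, 0x00])            # end-of-track event
    header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, ticks_per_quarter)
    return header + b"MTrk" + struct.pack(">I", len(track)) + bytes(track)
```

Calling `build_midi([(60, 480), (62, 480)])` returns the bytes for two quarter
notes (pitch 60 is middle C); writing them to a file opened in binary mode, for
example a hypothetical `melody.mid`, should give something a MIDI player can
open.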


LINE DETECTION
--------------

For line detection, I created a very simple algorithm. I start from the top
middle of the image and go down. When I find a black pixel, I walk to the left
and to the right until I find a white pixel. Counting the black pixels then
gives me the length of the line.

At first, I thought this would be the definition of a line: if the length is
greater than half the page width, it is a line. But it didn't work: a line can
contain a few white pixels because of imperfections in the image, and some
lines that do not belong to a stanza can occur. I had to redefine a line.

I introduced the concept of imperfection. We now have a variable initialized to
5. Every time we find a white pixel, this variable is decremented, and when it
reaches 0, we stop the traversal. But if we find a black pixel, we increment
the variable (up to 5).
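
The traversal described above can be sketched in Python as follows. This is a
reconstruction from the description, not Maestro's code: the function names
and the image representation (rows of booleans, `True` for black) are
assumptions, and the sketch reports every matching row, so a line several
pixels thick would show up more than once.

```python
def run_length(row, start, step, tolerance=5):
    """Count black pixels walking from `start` in direction `step` (+1 or -1).

    `budget` is the imperfection counter: it drops by one on each white
    pixel, recovers by one (up to `tolerance`) on each black pixel, and the
    walk stops when it reaches 0 or the edge of the image.
    """
    budget = tolerance
    black = 0
    x = start + step
    while 0 <= x < len(row) and budget > 0:
        if row[x]:
            black += 1
            budget = min(tolerance, budget + 1)
        else:
            budget -= 1
        x += step
    return black


def detect_lines(image, tolerance=5):
    """Return the indices of rows holding a line longer than half the width."""
    width = len(image[0])
    middle = width // 2
    lines = []
    for y, row in enumerate(image):   # go down the middle column
        if not row[middle]:
            continue
        left = run_length(row, middle, -1, tolerance)
        right = run_length(row, middle, +1, tolerance)
        if 1 + left + right > width / 2:
            lines.append(y)
    return lines
```

With `tolerance=1` the walk stops at the first white pixel, which corresponds
to the first, stricter definition of a line; the default of 5 lets the walk
cross small gaps.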


STANZA DEDUCTION
----------------

Once we had all the lines of the sheet, we had to group them into stanzas (the
five-line staves). My first approach was to group them five by five. The
problem was that if we missed a single line, all the following stanzas came out
wrong.

So I calculated the average distance between two lines, which let me predict
during the grouping whether we had left the current stanza. In the end, we were
able to detect stanzas perfectly even if we missed some lines.
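
One way to read this idea is sketched below in Python. It is only an
illustration under assumptions: the `group_into_stanzas` name, the input (a
sorted list of line positions) and the `factor` of 2 applied to the average gap
are mine, not taken from Maestro.

```python
def group_into_stanzas(line_ys, factor=2.0):
    """Group sorted line positions into stanzas.

    A gap larger than `factor` times the average gap is taken to mean that
    we have left the current stanza. `factor` is an illustrative guess.
    """
    if len(line_ys) < 2:
        return [list(line_ys)] if line_ys else []
    gaps = [b - a for a, b in zip(line_ys, line_ys[1:])]
    average = sum(gaps) / len(gaps)
    stanzas = [[line_ys[0]]]
    for previous, current in zip(line_ys, line_ys[1:]):
        if current - previous > factor * average:
            stanzas.append([current])    # too far away: start a new stanza
        else:
            stanzas[-1].append(current)  # close enough: same stanza
    return stanzas
```

For example, with three stanzas whose lines are 10 pixels apart and a missing
line at position 220, the input
`[100, 110, 120, 130, 140, 200, 210, 230, 240, 300, 310, 320, 330, 340]` is
still split into three groups, whereas grouping five by five would pull line
300 into the second stanza.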


USER INTERFACE
--------------

This was the first time I had to use Qt, and I was not sure whether to use Qt
Designer or write everything by hand in Emacs. In the end, I coded the
interface entirely in Emacs, to be sure I fully understood how Qt works.

The user interface was pretty simple, since it was just a school project: each
step was available at the top of the application. You just had to click through
them to run the whole processing.


SOUND MODULE
------------

At first I tried to use TiMidity++ to play the generated MIDI file, but the
configuration was way too hard, and we wanted our software to be simple to use.
After a while, I decided to try SDL with SDL_mixer, and it was so simple that
we had a functional MIDI player in one night.


MAC OS X PORT
-------------

With the Qt framework, it is quite easy to create a multiplatform application.

One advantage of the Mac version was that Qt could play a MIDI file directly,
without SDL, because it called QuickTime. We provided a package file to install
Maestro under Mac OS X, bundling the Qt libraries so that Mac users didn't have
to install Qt to use Maestro (unlike Linux users).


LAST WORDS
----------

Our software doesn't recognize every musicscore and can still be improved, but
we are proud of it. Its flaws are only due to the lack of time at the end of
the year; our recognition engine is flexible enough to be enhanced.

We couldn't implement everything in time, even with our coding nights before
the final presentation. Who knows, it could have become a reference, since
there are not many similar applications. Thanks to Maestro, we took second
place (vice-major) among all the projects of our class.


TECHNOLOGIES
------------

[Qt](http://www.qt.io/): Cross-platform application framework from Trolltech