Posted: Sat Feb 27, 2010 10:45 pm Post subject: Problems with Vocaloid 2
As I have indicated before, I thought that Vocaloid 2 would work better than Vocaloid 1, be more user-friendly, etc.
Instead, I have found it to be the opposite.
I just posted in the Sonika forum about how I imported a MIDI file with lyrics into Sonika and into Miriam, and it sounds much better in Miriam. In Sonika, some of the notes sound good, but others (even with the same phoneme) sound annoyingly nasal.
Now I tried that Voc 2 VSQ with a friend's Prima as well, and it sounds as bad as the Sonika version. (Not the same, but equally bad.) So it appears to me now, that it is not a Sonika problem, but a Vocaloid 2 problem.
It is unfortunate that this board has no provision (that I've seen) to attach a file or files to a post, so I am uploading a few files to box.net, for examples.
The original file is a standard MIDI file created in Sonar. Very simple, short, only the vocal track, with lyrics, durations quantized so no overlap, and punctuation removed from lyrics.
You can download that original MIDI file (not a Vocaloid file) at:
In both the Vocaloid 1 and 2 editors I imported that MIDI file, did the phoneme transformation, and fixed a few bad phonemes, pretty much the same in each editor. I did no tweaking aimed at a wonderful version, just a few phoneme fixes so far. (Both editors had problems with the lyric "everybody", which required fixing both times it occurs in the song. It didn't take long to fix, though, in either editor.)
Perhaps listen to those three .wav files first, before checking out the sources. The Miriam version sounds decent. It is not perfect, could use more tweaking, etc. However, it is a good start, just from importing the .midi file, doing the phoneme transformation, and fixing a few bad phonemes.
The Prima and Sonika versions sound downright awful, not a good start at all. Why?
Is there something I could change in my Vocaloid II editor settings, that would fix this kind of problem?
---------------
I am sure that a Vocaloid master like Giuseppe and others could take this and make it sound great in either Vocaloid version. In fact, he might even be able to make it sound better in II than in I, given its more expressive possibilities, etc.
However, an app like that should not be written so that only a few "masters" can make it sound decent. One should get decent results "out of the box", which one can then tweak to make better.
Why do the Vocaloid II versions sound so much worse than the Vocaloid I version? Can anyone explain?
Thank you for sharing the files; this makes your question easier to understand.
I had no problems transforming the lyrics from your MIDI file.
The problem lies in your language (English), in the word "every".
As you know, I don't know the English language, but Vocaloid does.
How many syllables does the word "every" divide into: "e-very" or "e-ve-ry"? Or perhaps this is a problem of pronunciation?
I have often found this problem in my songs, and in my example "Every breath you take" by Miriam I had to edit all the "every" words (and she says a few);
Vocaloid transforms the rest of the words correctly.
Another problem is the pronunciation of the word "let"
Vocaloid pronounces "let" correctly as a standalone syllable,
but when it is followed by a vowel you don't pronounce the final "t" purely; you make a much softer "t", very similar to the Spanish "r" sound.
In my VSQ file "let it be" (Sweet Ann, http://www.box.net/shared/pz5kkc9ehq) you can see an example of how I have modified this sound.
I have uploaded 3 examples made with your files:
1- Importing the MIDI file, correcting the word "every" and correcting some syllables (adding consonants) for better pronunciation.
Now you will say that you have to make lots of modifications and that this software is defective; I don't see it this way.
I also have to make many modifications to the background music if I want a good result.
Some day we will need to make fewer modifications with Vocaloid.
Sorry, I forgot something important: the Vocaloid configuration.
As I mentioned in the first tutorial on my blog, I do not like the configuration that Vocaloid uses by default (attack and vibrato): the attacks are very soft and the vibrato is very slow.
Here you can see how I set the default Vocaloid configuration
Thank you very much for your replies, your examples, etc. I appreciate it.
I have not had a chance yet to download and study your examples, but I will.
Just from reading your post though, I am not sure you completely understood my question in this thread.
I mentioned the problem with the word "everybody" in both editors, but that was not a serious problem at all--easily fixed in both.
My question is this: using the same source MIDI, imported into both Vocaloid I and Vocaloid II, and doing similar tweaking in both to fix phonemes (like "everybody"), with Miriam I got a result that is fairly decent. It could certainly use improvement (in fact, I have improved on it since I posted the example), but it is a decent rendition to start with.
In Vocaloid II, however, after importing the exact same MIDI file with lyrics, doing very similar phoneme fixing, etc., the result is terrible, played back through either Sonika or Prima.
Did you listen to the three .wav files? As I wrote, do that first, before looking at the .mids and .vsq. Why do the Vocaloid II versions sound terrible, while the Vocaloid I version is decent?
That is my main question in this thread.
-------------------------------
By the way, Giuseppe, as I mentioned in the OP, I know that someone like you, the Vocaloid Master, could take any of those files and fix the problems. (I will look at and study your examples later. Thank you for them.)
However, most users are not Vocaloid Masters, and cannot spend the amount of time on this software, that you apparently have. (Although I am sure everyone is very grateful for your help, tutorials, etc.)
The program should have been made more user-friendly in version II, not just for Vocaloid Masters, but for everyone.
If instead, Version II is less user-friendly, due to a much greater amount of tweaking required, many more hours spent (in comparison with Voc I), just to sound decent, that is a step backwards.
Anyhow, please listen to the three .wav files I uploaded, and tell me why the Sonika and Prima versions sound so much worse than the Miriam version. Did I do something wrong?
I would appreciate a reply from Anders on this as well.
Really, it seems that Miriam sounds better, but for various reasons.
Vocaloid1 has a better configuration than Vocaloid2 (or, put the other way, Vocaloid2's is more deficient).
In your Miriam .wav file, the vibrato is faster, and the vibrato gives greater quality to the voice.
The attacks are hard, and this improves the sound of the initial consonant or vowel.
All the Vocaloid1 voices sound harder.
Every voice is different and has its own personality; please review the Vocaloid2 configuration.
I am sure that this can improve your "first impression" on importing a MIDI file.
You constantly repeat that I, maybe, have more free time to use Vocaloid.
To do quality work, besides time, it is necessary to have the desire to do it.
I can spend less time than you, probably because I've spent more hours with the software, but you, and any user, can obtain the same quality with effort.
The time it takes to finish is not so important.
Do and answer the following, in the order specified.
1)--Listen to the .wav files I uploaded.
2)--Do you agree that the Miriam version, although far from perfect, sounds pretty good for just having imported the MIDI file, done the phoneme transformation, and made a few phoneme changes?
3)--Do you agree, that the Sonika and Prima versions (the .wav files I uploaded) sound terrible?
4) The Vocaloid II version was made the same way as the Vocaloid I version, taking about the same amount of time--importing the same standard MIDI File with lyrics, making the same phoneme changes, etc.
Therefore--the main question----
Why do the Sonika and Prima versions sound so much worse than the Miriam version? (Again, referring to my .wav files. At this point we are not discussing your versions at all.)
Of course, to answer that question, you might want to analyze my vsqs and mids.
First thing though--just listen to the .wavs I uploaded, and explain the immense difference, with Voc I coming up with something decent in little time, a good start, while Voc II, with the same source material and similar tweaking, came up with something that sounds terrible?
Did I do something wrong?
If so, please let me know.
Or, could it be, that there is something seriously flawed with Voc II?
---------------
The question is not whether you, the Vocaloid master, could fix those files. I knew you could. Look above to see my questions.
I believe that I've answered all your questions.
Have you seen my full video tutorial on Prima?
Initially the VSQ file sounds terrible, as you say, but we should not stop at the first bad impression. In my video, you can compare the initial bad sound of the VSQ with the final result.
I am still not sure you understand my main questions.
The point is not whether a project that starts out sounding terrible can be fixed to sound good by a Vocaloid master like yourself. I knew you would be able to do that; that is not the point.
I was specifically comparing Vocaloid I with Vocaloid II. Why does my Miriam .wav sound SO much better than my Prima and Sonika .wavs?
All three involved the same imported source material, and the exact same little bit of tweaking to fix a few minor phoneme problems.
None of the three are perfect, all could use more tweaking. (I have already tweaked the Miriam version much farther than the uploaded file.)
However, since Miriam sounds so much better to start with, I would assume that it would take much more time to get a decent result with Voc 2, while I got a decent result with Voc I with very little tweaking.
Shouldn't the newer version work better than the old one, and require less time to get similar results? In fact, though, it seems to be the opposite.
Anyhow--can you answer----why do my Sonika and Prima versions sound MUCH MUCH worse than the Miriam version?
Did I do something wrong with Voc 2?
Once again, I would appreciate Anders's response to this as well.
There are several reasons why the examples sound the way they do.
First of all the first note of the Prima version is sung by Sonika, (I assume that's just a mistake).
Both Sonika and Prima are sopranos (albeit very different sounding sopranos) and Miriam is a mezzosoprano. This is important to remember, because that means they have different qualities at different tonal ranges.
The first part of Maiki's song is near, or very near, a soprano's lower limit.
Vocaloid voices are based on 'real' singers, which means that vocal range and characteristics need to be taken into account when we 'make' a melody for Vocaloid voices.
In short, just as with real singers, Vocaloid voices sound best when they sing within their 'comfort zone' and Miriam's zone is different to Sonika's and Prima's.
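One way to check this point before blaming the editor is to count how many of a melody's notes fall inside a voice's comfortable range. The following is only a minimal sketch: the `VoiceRange` type, the `fraction_in_range` helper, the MIDI note numbers in the two example ranges, and the sample melody are all assumptions made up for illustration, not figures published for Miriam, Sonika, or Prima.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceRange:
    """Comfort range of a voice, expressed as MIDI note numbers (60 = middle C)."""
    name: str
    low: int   # lowest comfortable note (assumed value)
    high: int  # highest comfortable note (assumed value)


def fraction_in_range(notes, voice):
    """Return the share of notes that fall inside the voice's comfort range."""
    if not notes:
        return 0.0
    inside = sum(1 for n in notes if voice.low <= n <= voice.high)
    return inside / len(notes)


# Hypothetical ranges for illustration only; not taken from any product sheet.
MEZZO = VoiceRange("example mezzo-soprano", low=55, high=74)
SOPRANO = VoiceRange("example soprano", low=60, high=81)

# Hypothetical melody that sits low, like the opening described above.
melody = [55, 57, 59, 60, 62, 60, 59, 57]

for voice in (MEZZO, SOPRANO):
    share = fraction_in_range(melody, voice)
    print(f"{voice.name}: {share:.0%} of notes in range")
```

If most notes land outside a voice's range, transposing the melody up or down before rendering may be a cheaper first step than editing phonemes.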
Because of the complexity of the English language, the database for a Vocaloid voice contains literally thousands of articulations (combinations of consonants and vowels). This is the same for both Vocaloid 1 and 2.
This means that some articulations sound better than others and sometimes the Vocaloid editor picks the 'wrong' articulation.
Which is why there is a user dictionary, so you can save words that don't come out sounding right the first time.
I would therefore say that another reason Miriam sounds better in this example is coincidence. If you did the same thing again, with the same voices but different words, it's very likely that Sonika or Prima would sound better first and if you tried it again you'd get a different 'winner'.
In my opinion Vocaloid 2 is (most of the time) better, because there are improvements like automatic vibrato etc.
I also think that Giuseppe's point about the Vocaloid configuration is very important. The configuration settings can make a big difference to how the voices sound initially.
I do not think it is coincidence. This has happened to me multiple times. It is just the first time I posted examples like that.
I didn't know the first note of the prima version was sung by Sonika. Yes, that must have been a mistake.
Repeatedly, I get much poorer initial results with Voc 2 than Voc 1. Of course, either will require additional tweaking after importing a .mid like that. But if the initial result is so much worse with one than the other, one doesn't feel like bothering with it.
The difference is not minor. Anyone can listen to the 3 .mp3 files I uploaded, and hear a major difference in quality. The Miriam one is not perfect, could certainly use more work (I have worked on it more since), but it is decent. The Prima and Sonika versions are not just slightly less good--they are absolutely terrible!
Obviously, I know it is possible to make good-sounding music with Vocaloid 2, as one can hear in demos, such as those of Giuseppe-José. (Though I was surprised that the mp3s that he posted of the song I uploaded sounded almost as bad as the .wavs I had uploaded.)
However, if it is going to take many more hours with II than with I, due to the initial result being much worse, I don't know that it's worth it to me, and I will probably give up on Voc II.
One would think it would tend to go the other way--that version 2 would be more user-friendly, require fewer hours to get a decent result, etc. That does not seem to be the case though.
Could you tell Yamaha that they really need to improve the user friendliness of Voc II, that it should work better than Voc I, not worse? An update is in order. Adding more features is not helpful, if the basic functionality has become worse.
Or--in looking at the MIDI file I imported, and the VSQ file I created with it (from which I rendered the .wavs), do you see something I did wrong? If so, please let me know.
Thank you.
anders wrote:
Hi, sorry I'm late to this thread.
I can only comment on the example you posted.
My main point in my previous post was regarding the difference between the tonal ranges of the Vocaloid singers you used.
It's got nothing to do with version 1 or 2, the same applies to both.
Sonika and Prima are sopranos and Miriam is a mezzo-soprano; there is a big difference.
You can't expect Sonika and Prima to sound the same as, or as good as, Miriam if the melody is in a mezzo-soprano's range. It's like comparing apples and pears.
Btw. the Japanese Vocaloids have the 'preferred' tonal range stated on the box, which I think is a good idea.
Best
Anders
Sorry, Anders, I don't think the problem is just one of range, soprano vs. mezzo-soprano. It wasn't just a case of less than optimal sound. It involves horrible, unacceptable sound. Listen to those wavs again.