Gradio

About the Metrics

Meter - How closely the model sticks to the meter of the lines.
Verse - How closely the model aligns the lines to the verse and chorus breakup.
Focus - How much of the response is extraneous commentary instead of the song. (Focus in particular has a very minor contribution to the final score)
Thinking - The estimated average number of thinking tokens per response. Zero means it's not a reasoning model (or is a hybrid model with reasoning off).