The Image2Biomass competition scores our predictions with a formula. You try to predict how much plant matter (biomass) is in a pasture image, and the organizers measure how close those predictions are to the real values. This score is called a weighted R².
This is the main math formula used to score our predictions. Don’t worry if it looks intimidating. We’ll break it down step by step:
R_w² = 1 - ( Σ w_j (y_j - ŷ_j)² ) / ( Σ w_j (y_j - ȳ_w)² )
In simple terms:
y_j: the actual (true) value of biomass.
ŷ_j: the value you predicted.
w_j: how important this row is (weights are higher for more important components).
ȳ_w: the average of all actual values, adjusted for importance.
This is how we calculate the weighted average of all the true biomass values:
ȳ_w = ( Σ w_j y_j ) / ( Σ w_j )
This gives more influence to rows that are considered more important (have a higher weight).
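To make the weighted average concrete, here is a minimal sketch in plain Python. The function name `weighted_mean` and the numbers are made up for illustration and are not taken from the competition data.

```python
def weighted_mean(y_true, weights):
    """Return sum(w_j * y_j) / sum(w_j), i.e. the weighted average ȳ_w."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * y for w, y in zip(weights, y_true)) / total_weight


# Hypothetical values: the third row counts twice as much as the others.
y_true = [10.0, 20.0, 40.0]
weights = [1.0, 1.0, 2.0]

print(weighted_mean(y_true, weights))  # 27.5
print(sum(y_true) / len(y_true))       # plain mean, about 23.33
```

The weighted mean (27.5) sits closer to 40 than the plain mean does, because the row with the value 40 carries the largest weight.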
SS_res = Σ w_j (y_j - ŷ_j)²
This is the top part (numerator) of the scoring formula. It adds up all the squared differences between our predictions and the truth, each multiplied by how important that row is. The smaller this value, the better.
SS_tot = Σ w_j (y_j - ȳ_w)²
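This is the bottom part (denominator) of the scoring formula. It measures how spread out the true values are around the weighted average ȳ_w, again weighted by importance. It sets the baseline for comparison, so the final score is R_w² = 1 - SS_res / SS_tot.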
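Putting the pieces together, the whole score can be sketched in a few lines of plain Python. This is an illustration of the formula as written here, not the organizers' scoring code. The function name `weighted_r2` and the sample numbers are assumptions.

```python
def weighted_r2(y_true, y_pred, weights):
    """Weighted R²: 1 - SS_res / SS_tot, following the formulas above."""
    total_weight = sum(weights)
    # ȳ_w: weighted average of the true values
    y_bar_w = sum(w * y for w, y in zip(weights, y_true)) / total_weight
    # SS_res: weighted squared errors of the predictions
    ss_res = sum(w * (y - p) ** 2 for w, y, p in zip(weights, y_true, y_pred))
    # SS_tot: weighted spread of the true values around ȳ_w
    ss_tot = sum(w * (y - y_bar_w) ** 2 for w, y in zip(weights, y_true))
    if ss_tot == 0:
        raise ValueError("SS_tot is zero: all true values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical values, chosen only to keep the arithmetic easy to follow.
y_true = [10.0, 20.0, 40.0]
y_pred = [12.0, 18.0, 39.0]
weights = [1.0, 1.0, 2.0]

# SS_res = 1*4 + 1*4 + 2*1 = 10, SS_tot = 306.25 + 56.25 + 312.5 = 675
print(round(weighted_r2(y_true, y_pred, weights), 4))  # 0.9852
```

A perfect prediction gives SS_res = 0 and therefore a score of exactly 1.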
The scoring system gives more weight to some components. This pie chart shows how each one contributes to the final score:
This bar chart shows the average amount of each biomass type in the training data. It helps you understand what is “typical.”
This chart compares our model’s R² score to a dumb baseline that just predicts the mean. It shows that our model explains a lot more of the variation in the data:
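The same comparison can be checked with numbers. The sketch below assumes the baseline always predicts the weighted average ȳ_w of the true values. The function name and all values are hypothetical, and the "model" predictions are invented for illustration, not real results.

```python
def weighted_r2(y_true, y_pred, weights):
    """Weighted R²: 1 - SS_res / SS_tot (assumes SS_tot is not zero)."""
    y_bar_w = sum(w * y for w, y in zip(weights, y_true)) / sum(weights)
    ss_res = sum(w * (y - p) ** 2 for w, y, p in zip(weights, y_true, y_pred))
    ss_tot = sum(w * (y - y_bar_w) ** 2 for w, y in zip(weights, y_true))
    return 1.0 - ss_res / ss_tot


# Hypothetical true values and weights.
y_true = [10.0, 20.0, 40.0]
weights = [1.0, 1.0, 2.0]

# Baseline: predict the weighted average (27.5) for every row.
y_bar_w = sum(w * y for w, y in zip(weights, y_true)) / sum(weights)
baseline_pred = [y_bar_w] * len(y_true)

# Invented predictions standing in for a reasonable model and a bad one.
model_pred = [12.0, 18.0, 39.0]
bad_pred = [40.0, 20.0, 10.0]

print(weighted_r2(y_true, baseline_pred, weights))         # 0.0
print(round(weighted_r2(y_true, model_pred, weights), 4))  # 0.9852
print(weighted_r2(y_true, bad_pred, weights))              # -3.0
```

The baseline scores 0 because its SS_res equals SS_tot. Anything above 0 explains more of the variation than the baseline does, and a score below 0 means the predictions are worse than always guessing the weighted average.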