Currently, it is difficult to put in context and compare the results from a given evaluation of a recommender system, mainly because too many alternatives exist when designing and implementing an evaluation strategy. Furthermore, the actual implementation of a recommendation algorithm sometimes diverges considerably from its well-known ideal formulation due to manual tuning and modifications observed to work better in some situations. RiVal, a recommender system evaluation toolkit, allows for complete control of the different evaluation dimensions involved in any experimental evaluation of a recommender system: data splitting, definition of evaluation strategies, and computation of evaluation metrics. In this demo we present some of the functionality of RiVal and show step-by-step how it can be used to evaluate the results from any recommendation framework and ensure that the results are comparable and reproducible.
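
As a frame of reference for these three dimensions, the following minimal sketch illustrates what such an evaluation pipeline can look like: a random training/test split, an "all unrated items" candidate strategy, and a simple error metric. All class and method names below are hypothetical placeholders introduced for this description; they are not RiVal's actual API.

// Illustrative sketch of the three evaluation dimensions: data splitting,
// an evaluation (candidate-item) strategy, and metric computation.
// All names are hypothetical placeholders, not RiVal's actual API.
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

public class EvaluationSketch {

    /** Stage 1: data splitting. Randomly hold out a fraction of each user's ratings as test data. */
    static List<Map<Integer, Map<Integer, Double>>> randomSplit(
            Map<Integer, Map<Integer, Double>> ratings, double testFraction, long seed) {
        Map<Integer, Map<Integer, Double>> train = new HashMap<>();
        Map<Integer, Map<Integer, Double>> test = new HashMap<>();
        Random rnd = new Random(seed);
        for (Map.Entry<Integer, Map<Integer, Double>> user : ratings.entrySet()) {
            for (Map.Entry<Integer, Double> item : user.getValue().entrySet()) {
                Map<Integer, Map<Integer, Double>> target =
                        rnd.nextDouble() < testFraction ? test : train;
                target.computeIfAbsent(user.getKey(), u -> new HashMap<>())
                      .put(item.getKey(), item.getValue());
            }
        }
        return List.of(train, test); // index 0: training split, index 1: test split
    }

    /** Stage 2: evaluation strategy. Which items should the recommender rank for a user?
     *  Here: every item the user has not rated in the training split. */
    static Set<Integer> candidateItems(int user,
                                       Map<Integer, Map<Integer, Double>> train,
                                       Set<Integer> allItems) {
        Set<Integer> candidates = new HashSet<>(allItems);
        candidates.removeAll(train.getOrDefault(user, Map.of()).keySet());
        return candidates;
    }

    /** Stage 3: metric computation. RMSE between predicted and held-out test ratings. */
    static double rmse(Map<Integer, Map<Integer, Double>> predictions,
                       Map<Integer, Map<Integer, Double>> test) {
        double sumSquaredError = 0.0;
        int count = 0;
        for (Map.Entry<Integer, Map<Integer, Double>> user : test.entrySet()) {
            Map<Integer, Double> userPredictions =
                    predictions.getOrDefault(user.getKey(), Map.of());
            for (Map.Entry<Integer, Double> item : user.getValue().entrySet()) {
                Double predicted = userPredictions.get(item.getKey());
                if (predicted == null) {
                    continue; // skip pairs the recommender did not score
                }
                double error = predicted - item.getValue();
                sumSquaredError += error * error;
                count++;
            }
        }
        return count == 0 ? Double.NaN : Math.sqrt(sumSquaredError / count);
    }
}

Each of these stages is a point where evaluation setups commonly diverge (split protocol, candidate item selection, metric definition), which is why controlling them explicitly is central to obtaining comparable and reproducible results.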