LLM Model Evaluator
A side-by-side comparison tool that sends a prompt to two LLMs simultaneously and displays both responses in real time. Built to explore how different models handle the same input — useful for evaluating tone, accuracy, verbosity, and reasoning style.
Stack: Python · Streamlit · Groq API
Features:
- Enter any prompt once and send it to two models at the same time
- Responses displayed side-by-side for easy comparison
- Powered by Groq’s fast inference API