Language Features
Grades the quality and effectiveness of language in a response, focusing on factors such as clarity, coherence, conciseness, and overall communication.
The language feature score helps analyze how well the language used in a response conveys the intended message, whether it addresses the question or issue comprehensively, and whether it is free from ambiguity or confusion.
Columns required:
response: The response given by the model
How to use it?
from uptrain import EvalLLM, Evals

OPENAI_API_KEY = "sk-********************"  # Insert your OpenAI key here

# Each row needs a "response" column holding the model output to grade
data = [{
    "response": "hey, so quadratic equation solving, I will guide you! just refer to any book on basic algebra it is pretty straightforward, even a dummy can understand."
}]

eval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)

# Run the language critique check on every row
res = eval_llm.evaluate(
    data=data,
    checks=[Evals.CRITIQUE_LANGUAGE]
)
Sample Response:
[
    {
        "score_fluency": 0.4,
        "score_coherence": 0.4,
        "score_grammar": 0.4,
        "score_politeness": 0.2,
        "explanation_fluency": "The text is not fluent and sounds awkward due to the informal language and lack of proper structure.",
        "explanation_coherence": "The text lacks coherence as it jumps between different topics without a clear connection.",
        "explanation_grammar": "The text contains grammatical errors and informal language that is not suitable for a professional or academic setting.",
        "explanation_politeness": "The tone is impolite and condescending, using the term \"dummy\" which is inappropriate."
    }
]
The generated response is poor: it contains inappropriate words such as “dummy”, has some grammatical errors, and uses unnecessarily informal phrasing such as “I will guide you” and “it is pretty straightforward”.
This results in low language feature scores.
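To act on these scores in code, the per-feature values can be averaged and the weak features listed. The snippet below is only a sketch: it hard-codes the scores from the sample response above instead of calling the API, and both the helper name summarize_language_scores and the 0.5 cutoff are illustrative assumptions, not part of UpTrain.

```python
# Scores copied from the sample response above (explanations omitted).
res = [{
    "score_fluency": 0.4,
    "score_coherence": 0.4,
    "score_grammar": 0.4,
    "score_politeness": 0.2,
}]


def summarize_language_scores(row, threshold=0.5):
    """Hypothetical helper: average the score_* fields and list weak features.

    The threshold is an arbitrary illustrative cutoff, not an UpTrain default.
    """
    # Keep only the numeric score fields, keyed by feature name
    scores = {
        key[len("score_"):]: value
        for key, value in row.items()
        if key.startswith("score_")
    }
    mean = sum(scores.values()) / len(scores)
    # Features scoring below the cutoff, in alphabetical order
    weak = sorted(name for name, value in scores.items() if value < threshold)
    return mean, weak


mean_score, weak_features = summarize_language_scores(res[0])
print(round(mean_score, 2), weak_features)
```

For the sample response, every feature falls below the assumed cutoff, which matches the low overall rating described above.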
How it works?
We evaluate language features by determining which of the following three cases applies to the given response across features such as fluency, politeness, grammatical correctness, and coherence:
- The response is highly rated on these features.
- The response is moderately rated on these features.
- The response is poorly rated on these features.