questions_report.py - question queries and reports¶
This module provides tools for querying and reporting on students’ answers to questions and related info. This is used by the assignments/grades_report endpoint.
Imports¶
These are listed in the order prescribed by PEP 8.
Standard library¶
Third-party imports¶
Local application imports¶
None.
Questions query¶
Provide a convenient container for storing some results of the query.
Given an query which defines the questions of interest, return the struct grades[user_id][div_id]
:
1grades = ordered dict {
2 str(user_id): { # Will be an ordered dict if user_id == None.
3 str(div_id):
4 namedtuple(str(question type_), int(max points), str(chapter), if user_id == None
5 str(subchapter), str(question number))
6 namedtuple(str(first_name), str(last_name), str(email)) if div_id == None
7 [datetime(timestamp), float(score), answer, correct] otherwise
8 }
9}
This is like a spreadsheet:
grades[user_id][div_id] contains:
Note: informal testing shows that proper indices are critical to making these query run fast enough. They are:
## create index lp_answers_mkey on lp_answers (div_id, sid, course_name, timestamp);
## create index mchoice_answers_mkey on mchoice_answers (div_id, sid, course_name, timestamp);
## create index fitb_answers_mkey on fitb_answers (div_id, sid, course_name, timestamp);
## create index shortanswer_answers_mkey on shortanswer_answers (div_id, sid, course_name, timestamp);
Use the Postgres EXPLAIN visualizer to imporve queries.
The name of the course.
A query defining the questions of interest.
Build grades struct and populate with row/col headers.
body data query
Select only this class.
Include the due date in the query if it’s provided. Define a query to select the newest (maximum) useinfo row for each unique set of (sid, div_id).
This selects the newest (max) useinfo.timestamp
for each record. Note that any selected field must appear in the group by clause – see https://blog.jooq.org/2016/12/09/a-beginners-guide-to-the-true-order-of-sql-operations/.
Now, make this an SQL clause. This is required, since SQL can’t select any fields outside of the groupby fields, and adding additional fields causes the max not to work.
Omit the closing semicolon from the previous query to use it as a subquery.
Use this id to limit the query.
Left join fields…
db.question_grades.score,
db.clickablearea_answers.answer,
db.clickablearea_answers.correct,
db.code.code,
db.codelens_answers.answer,
db.codelens_answers.correct,
db.dragndrop_answers.answer,
db.dragndrop_answers.correct,
db.fitb_answers.answer,
db.fitb_answers.correct,
db.lp_answers.answer,
db.lp_answers.correct,
db.mchoice_answers.answer,
db.mchoice_answers.correct,
db.parsons_answers.answer,
db.parsons_answers.correct,
db.shortanswer_answers.answer,
Get to the answer/correct fields for various problems, if they exist – hence the left join.
Include a question grade, if one exists.
Join a question grade to the question that was graded.
Join to auth_user
to get information about each user.
Include answer and correct fields for each question type.
db.clickablearea_answers.on(
(db.useinfo.timestamp == db.clickablearea_answers.timestamp)
& (db.useinfo.sid == db.clickablearea_answers.sid)
& (db.useinfo.div_id == db.clickablearea_answers.div_id)
& (db.clickablearea_answers.course_name == course_name)
),
db.code.on(
(db.useinfo.timestamp == db.code.timestamp)
& (db.useinfo.sid == db.code.sid)
& (db.useinfo.div_id == db.code.acid)
& (db.useinfo.course_id == db.code.course_id)
),
db.codelens_answers.on(
(db.useinfo.timestamp == db.codelens_answers.timestamp)
& (db.useinfo.sid == db.codelens_answers.sid)
& (db.useinfo.div_id == db.codelens_answers.div_id)
& (db.codelens_answers.course_name == course_name)
),
db.dragndrop_answers.on(
(db.useinfo.timestamp == db.dragndrop_answers.timestamp)
& (db.useinfo.sid == db.dragndrop_answers.sid)
& (db.useinfo.div_id == db.dragndrop_answers.div_id)
& (db.dragndrop_answers.course_name == course_name)
),
db.fitb_answers.on(
(db.useinfo.timestamp == db.fitb_answers.timestamp)
& (db.useinfo.sid == db.fitb_answers.sid)
& (db.useinfo.div_id == db.fitb_answers.div_id)
& (db.fitb_answers.course_name == course_name)
),
db.lp_answers.on(
(db.useinfo.timestamp == db.lp_answers.timestamp)
& (db.useinfo.sid == db.lp_answers.sid)
& (db.useinfo.div_id == db.lp_answers.div_id)
& (db.lp_answers.course_name == course_name)
),
db.mchoice_answers.on(
(db.useinfo.timestamp == db.mchoice_answers.timestamp)
& (db.useinfo.sid == db.mchoice_answers.sid)
& (db.useinfo.div_id == db.mchoice_answers.div_id)
& (db.mchoice_answers.course_name == course_name)
),
db.parsons_answers.on(
(db.useinfo.timestamp == db.parsons_answers.timestamp)
& (db.useinfo.sid == db.parsons_answers.sid)
& (db.useinfo.div_id == db.parsons_answers.div_id)
& (db.parsons_answers.course_name == course_name)
),
db.shortanswer_answers.on(
(db.useinfo.timestamp == db.shortanswer_answers.timestamp)
& (db.useinfo.sid == db.shortanswer_answers.sid)
& (db.useinfo.div_id == db.shortanswer_answers.div_id)
& (db.shortanswer_answers.course_name == course_name)
),
),
):
Get the answer and correct info based on the type of question.
Place the query into its appropriate matrix location.
This is a placeholder for the number of attempts.
For SQL performance analysis.
The name of the course.
A query defining the questions of interest.
True to sort the div_ids
by db.assignment_questions.sorting_priority
.
div_id headings query
Get information about each div_id.
Produce the div_id
entries in the order specified by the assignment if requested.
Eliminate duplicates.
Store the max points for this div_id
.
user_id headings query
The base query for this entire function. It’s used to get information about each user_id/div_id combination.
Choose seleted questions.
Join them to useinfo
.
Select only questions in the provided course.
Get information about each user_id.
A query to specify all students in a given course.
First, find all students enrolled in the course. This may include students who answered no questions, hence this part of the query.
Second, find all students who answered a question in this course. Students can later remove themselves from a course, but their answers will still be in the course, hence this part of the query. Likewise, students who aren’t logged in but did answer questions aren’t enrolled in the course, but should also be included.
Remove any students produced by the previous query.
Get the associated auth_user
record if possible.
Ask for distinct entries, since a given student will answer many questions.
distinct=True,
**select_kwargs
):
user_id = row.useinfo.sid
assert user_id not in grades
grades[user_id] = dict()
grades[user_id][None] = _UserInfo._make(
[row.auth_user.first_name, row.auth_user.last_name, row.auth_user.email]
)
return grades, query
def _row_decode(row, question_type):
timestamp = row.useinfo.timestamp
Use a specific table’s timestamp field if at all possible; otherwise, use the useinfo timestamp.
Some queries don’t select the timestamp field. In other cases, it may be present but None, while the useinfo
timestamp is valid. So:
return table.get("timestamp", timestamp) or timestamp
if question_type == "clickablearea":
return (
row.clickablearea_answers.answer,
row.clickablearea_answers.correct,
ts_get(row.clickablearea_answers),
)
elif question_type in ("activecode", "actex"):
ts = row.code.get("timestamp", timestamp) or timestamp
The format of useinfo.act
for code problems with a unit test looks like percent:66.6666666667:passed:2:failed:1
. The code isn’t as useful to display. The grade is None in the code
table, so don’t bother showing it.
Code problems without a unit test won’t be parsed.
return "", None, ts
return row.useinfo.act, float(percent) >= 100, ts
elif question_type == "codelens":
return (
row.codelens_answers.answer,
row.codelens_answers.correct,
ts_get(row.codelens_answers),
)
elif question_type == "dragndrop":
return (
row.dragndrop_answers.answer,
row.dragndrop_answers.correct,
ts_get(row.dragndrop_answers),
)
elif question_type == "fillintheblank":
answer = row.fitb_answers.answer
try:
Guess this is JSON-encoded or empty.
Handle non-JSON encoded fitb answers.
Multiple choice questions store their answer as a comma-separated string. Turn this into an array of ints.
answer = row.mchoice_answers.answer
answer = answer and [int(ans) for ans in answer.split(",")]
return (
answer,
row.mchoice_answers.correct,
ts_get(row.mchoice_answers),
)
elif question_type == "parsonsprob":
return (
row.parsons_answers.answer,
row.parsons_answers.correct,
ts_get(row.parsons_answers),
)
elif question_type == "shortanswer":
Prefer data from the shortanswer table if we have it; otherwise, we can use useinfo’s act.
Try to JSON decode this, for old data.
The newer format is to store the answer as a pure string. So, answer
already has the correct value.
Unknown question! Panic!
Update the grades structure with the number of attempts for each student.
Attempts are collected in a separate query, since the body data query only selects specific timestamps (newest/those graded/etc).
Create an entry for the grade if we don’t have one already.
Fill in the placeholder.
Assignment report¶
Produces grades
for the given course/assignment.
The name of the course.
The name of the assignment.
Verify the course and assignment are valid.
Define the questions of interest, given an assignment and course.
Select the desired assignment.
Restrict the query to a specific course.
Join to assignment_questions
and from there to questions
.
Build grades struct and populate with row/col headers.
body data query
Use this id to limit the query.
for row in db(
query_questions
& (db.question_grades.div_id == db.questions.name)
& (db.question_grades.course_name == course_name)
).select(
db.question_grades.sid,
db.question_grades.div_id,
db.question_grades.score,
db.clickablearea_answers.answer,
db.clickablearea_answers.correct,
db.clickablearea_answers.timestamp,
db.code.code,
db.code.timestamp,
db.codelens_answers.answer,
db.codelens_answers.correct,
db.codelens_answers.timestamp,
db.dragndrop_answers.answer,
db.dragndrop_answers.correct,
db.dragndrop_answers.timestamp,
db.fitb_answers.answer,
db.fitb_answers.correct,
db.fitb_answers.timestamp,
db.lp_answers.answer,
db.lp_answers.correct,
db.lp_answers.timestamp,
db.mchoice_answers.answer,
db.mchoice_answers.correct,
db.mchoice_answers.timestamp,
db.parsons_answers.answer,
db.parsons_answers.correct,
db.parsons_answers.timestamp,
##db.shortanswer_answers.answer,
##db.shortanswer_answers.timestamp,
db.useinfo.timestamp,
db.useinfo.act,
Get to the answer/correct fields for various problems, if they exist – hence the left join.
Include answer and correct fields for each question type.
db.clickablearea_answers.on(
(db.questions.question_type == "clickablearea")
& (db.question_grades.answer_id == db.clickablearea_answers.id)
),
db.code.on(
(db.questions.question_type in ("activecode", "actex"))
& (db.question_grades.answer_id == db.code.id)
),
db.codelens_answers.on(
(db.questions.question_type == "codelens")
& (db.question_grades.answer_id == db.codelens_answers.id)
),
db.dragndrop_answers.on(
(db.questions.question_type == "dragndrop")
& (db.question_grades.answer_id == db.dragndrop_answers.id)
),
db.fitb_answers.on(
(db.questions.question_type == "fillintheblank")
& (db.question_grades.answer_id == db.fitb_answers.id)
),
db.lp_answers.on(
(db.questions.question_type == "lp_build")
& (db.question_grades.answer_id == db.lp_answers.id)
),
db.mchoice_answers.on(
(db.questions.question_type == "mchoice")
& (db.question_grades.answer_id == db.mchoice_answers.id)
),
db.parsons_answers.on(
(db.questions.question_type == "parsonsprob")
& (db.question_grades.answer_id == db.parsons_answers.id)
),
## TODO: currently, the autograder stores the ID of the associated useinfo entry, so this code is wrong. If the autograder is updated, then can be used.
##db.shortanswer_answers.on(
## (db.questions.question_type == "shortanswer")
## & (db.question_grades.answer_id == db.shortanswer_answers.id)
##),
The autograder for interaction-only questions stores a useinfo ID. Get info from there for these questions.
If a student answers no questions, then is autograded, then is removed from the course, the headings query doesn’t contain this student. Add them in.
username = row.question_grades.sid
if username not in grades:
au_row = (
db(db.auth_user.username == username)
.select(
db.auth_user.first_name, db.auth_user.last_name, db.auth_user.email
)
.first()
)
grades[username] = dict()
grades[username][None] = _UserInfo._make(
[au_row.first_name, au_row.last_name, au_row.email]
)
Get the answer and correct info based on the type of question.
Place the query into its appropriate matrix location.
This is a placeholder for the number of attempts.
Send the grades
to the web client by transforming it to dict of data for use with Handsontable.
The grades
data structure returned from query_assignment
.
Convert the iterator returned by a dict to a list. Otherwise, the iterator will be used up after producing the first row of data.
Return a student’s score, or 0 if no score was reported.
Return a “” if a divide by zero occurs.
Collect the points for each question.
Index 0
Index 1
Index 2
Index 3
Index 4
Index 5
Index 6 and following: rows of grades data.
Index 0
Index 1
Index 2
Index 3
Index 4 – each student’s grade. Skip the _UserInfo obtained from grades[user_id][None]
.
Get grades[None].keys()[1:], but using iterator syntax.
Compute the average score and average attempts for each question.
Merge cells when the location is identical.
Start at the beginning of real data – indices 0-4 is userid, first name, last name, e-mail, avg grade.
End one index before the actual end, since this must compare index to index+1.
Loop if locations are identical to count the colspan size.
Gather the results into a single object to send to the client.
Provide a function that knows how to JSON encode a datetime. This should be passed to the default
parameter in json.dumps
.
All times are in UTC, even though the datetime object doesn’t know that.
from pprint import pprint; pprint(grades)