System Architecture

To utilise the TrueLearn library for modeling users and predicting their engagement, developers need to convert the raw text data into Wikipedia topics using truelearn.preprocessing. Once the topics are extracted, developers can initialise an EventModel object using truelearn.models. They can then use this model and a label indicating the learner's engagement in the given event to train the classifier in truelearn.learning.


Internally, the classifier uses information from events and labels to update the LearnerModel, which can be used for interactive visualisations via truelearn.utils.visualisations. These visualisations help learners reflect on their learning. In addition, the updated LearnerModel allows the classifier to make accurate predictions about the learner's engagement in future events.


Developers can also use the PEEK dataset, a large novel dataset on learner engagement with educational videos, through truelearn.datasets to experiment with TrueLearn and apply different metrics defined in truelearn.utils.metrics to evaluate the classifier's performance.

Library Structure

The TrueLearn library is designed to be modular and extensible. Developers can easily add new features to the library by creating new classes that inherit from the base classes in each of the respective modules, which are listed below:

  • truelearn.datasets:
    Contains the functions to download and use the PEEK dataset.
  • truelearn.learning:
    Contains the implementation of all TrueLearn classifiers.
  • truelearn.models:
    Contains the LearnerModel and EventModel classes.
  • truelearn.preprocessing:
    Contains functions to extract topics from raw text data (Wikifier).
  • truelearn.tests:
    Contains unit tests for the library.
  • truelearn.utils:
    Contains functions to evaluate the classifiers performance and visualise the LearnerModel.
    • truelearn.utils.metrics:
      Contains evaluation functions for the classifiers such as accuracy, precision, recall, and f1 score.
    • truelearn.utils.visualisations:
      Contains the plotting functions and classes such as RosePlotter, WordPlotter and LinePlotter.

Design Patterns

Functional pattern

To write self-explanatory code, we utilise the functional design pattern via Python's diverse Iterator APIs. We used this pattern in methods that restructure data in truelearn.datasets and classifiers in truelearn.learning. This significantly improves the readability of our codebase as we are programming declaratively rather than imperatively.

Strategy pattern

To ensure the extensibility of our method, we use the strategy pattern in truelearn.datasets. Developers can pass their custom knowledge component generator functions to the load_peek_dataset method, which will generate the knowledge components they want. This is especially useful if developers create their own knowledge components and want the truelearn.datasets package to use their type.

Template pattern

The Knowledge, Novelty and Interest Classifiers share a lot of similar code as they only differ in the way they update the learner model and make predictions. Therefore, we apply the template pattern to implement shared logic and use dynamic dispatch to allow each classifier to implement their respective parts, which reduces code redundancy and makes the code more maintainable.

Duck Typing

Duck typing is a strategy commonly used in Python and other dynamically typed programming languages, it’s based on the idea that an object is not as important as its behaviour. In other words, if an object walks like a duck and quacks like a duck, then it is a duck, regardless of its actual type. We applied this idea when designing truelearn.models to make our library more extensible. For example, developers do not need to inherit our classes to create their knowledge components. They just need to make sure that their implementation has the necessary methods and can be used correctly by other methods in the library.

  • The traditional approach to this means the programmer doesn’t have a way of specifying the necessary functionality as the behaviour is implicit. To overcome this we make use of the Protocol class introduced in Python 3.8 to explicitly define the required behaviour. However, for our library to be compatible with Python 3.7, we use the official Python extension typing_extensions to port the Protocol class to Python 3.7.

Class Diagrams

The following are class diagrams for the three most important sub-packages in truelearn: truelearn.models, truelearn.learning, truelearn.utils.visualisations.






Public API

Please see our documentation for the list of the public API provided.