In data mining and data analytics, tools and techniques once confined to research laboratories are being adopted by forward-looking industries to generate business intelligence for improving decision making. Higher education institutions are beginning to use analytics for improving the services they provide and for increasing student grades and retention. The U.S. Department of Education’s National Education Technology Plan, as one part of its model for 21st-century learning powered by technology, envisions ways of using data from online learning systems to improve instruction.
With analytics and data mining experiments in education starting to proliferate, sorting out fact from fiction and identifying research possibilities and practical applications are not easy. This issue brief is intended to help policymakers and administrators understand how analytics and data mining have been—and can be—applied for educational improvement.
At present, educational data mining tends to focus on developing new tools for discovering patterns in data. These patterns are generally about the microconcepts involved in learning: one-digit multiplication, subtraction with carries, and so on. Learning analytics—at least as it is currently contrasted with data mining—focuses on applying tools and techniques at larger scales, such as in courses and at schools and postsecondary institutions. But both disciplines work with patterns and prediction: If we can discern the pattern in the data and make sense of what is happening, we can predict what should come next and take the appropriate action.
Educational data mining and learning analytics are used to research and build models in several areas that can influence online learning systems. One area is user modeling, which encompasses what a learner knows, what a learner’s behavior and motivation are, what the user experience is like, and how satisfied users are with online learning. At the simplest level, analytics can detect when a student in an online course is going astray and nudge him or her on to a course correction. At the most complex, they hold promise of detecting boredom from patterns of key clicks and redirecting the student’s attention. Because these data are gathered in real time, there is a real possibility of continuous improvement via multiple feedback loops that operate at different time scales: immediate to the student for the next problem, daily to the teacher for the next day’s teaching, monthly to the principal for judging progress, and annually to the district and state administrators for overall school improvement.
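As a rough illustration of the simplest case, detecting that a student is going astray, the sketch below flags a student whose share of correct answers over a recent window falls below a threshold. It is not taken from the brief: the class name `AstrayDetector`, the window size, and the threshold are all hypothetical choices, and a real system would use a validated learner model rather than a fixed cutoff.

```python
from collections import deque


class AstrayDetector:
    """Hypothetical rule: flag a student whose recent accuracy is low."""

    def __init__(self, window=5, min_correct_rate=0.4):
        # Only the most recent `window` answers are kept.
        self.recent = deque(maxlen=window)
        self.min_correct_rate = min_correct_rate

    def record(self, correct):
        """Store one answer; return True if a nudge seems warranted."""
        self.recent.append(bool(correct))
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough evidence yet
        rate = sum(self.recent) / len(self.recent)
        return rate < self.min_correct_rate


# Made-up answer stream for one student (True = correct answer).
answers = [True, True, False, False, False, False, False]
detector = AstrayDetector()
flags = [detector.record(a) for a in answers]
print(flags)
```

The same flag could feed the slower loops described above, for example as a daily count per class for the teacher.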
The same kinds of data that inform user or learner models can be used to profile users. Profiling as used here means grouping similar users into categories using salient characteristics. These categories then can be used to offer experiences to groups of users or to make recommendations to the users and adaptations to how a system performs.
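One common way to group similar users is clustering. The sketch below is a minimal k-means in plain Python, offered only as an illustration; the brief does not prescribe this algorithm. The function name `profile_users`, the two features (share of assignments completed, share of answers correct), the starting centroids, and the student data are all assumptions made for the example.

```python
from math import dist
from statistics import mean


def profile_users(features, centroids, iterations=10):
    """Minimal k-means: return one cluster label per feature vector."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each user joins the nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda i: dist(f, centroids[i]))
            for f in features
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for i, old in enumerate(centroids):
            members = [f for f, lab in zip(features, labels) if lab == i]
            if members:
                updated.append(tuple(mean(axis) for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster where it was
        centroids = updated
    return labels


# Hypothetical users: (share of assignments completed, share answered correctly).
students = {
    "a": (0.90, 0.85),
    "b": (0.95, 0.90),
    "c": (0.30, 0.40),
    "d": (0.20, 0.35),
    "e": (0.85, 0.80),
}
labels = profile_users(list(students.values()), [(0.0, 0.0), (1.0, 1.0)])
profiles = dict(zip(students, labels))
print(profiles)
```

Each resulting label is a category that a system could use to offer a shared experience or recommendation; choosing the features and the number of groups remains a judgment call.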
User modeling and profiling are suggestive of real-time adaptations. In contrast, some applications of data mining and analytics are for more experimental purposes. Domain modeling is largely experimental with the goal of understanding how to present a topic and at what level of detail. The study of learning components and instructional principles also uses experimentation to understand what is effective at promoting learning.
These examples might suggest that the actions resulting from data mining and analytics are always automatic, but that is often not the case. Visual data analytics closely involve humans to help make sense of data, from initial pattern detection and model building to sophisticated data dashboards that present data in a way that humans can act upon. K–12 schools and school districts are starting to adopt such institution-level analyses for detecting areas for instructional improvement, setting policies, and measuring results. Making visible students’ learning and assessment activities opens up the possibility for students to develop skills in monitoring their own learning and to see directly how their effort improves their success. Teachers gain views into students’ performance that help them adapt their teaching or initiate tutoring, tailored assignments, and the like.
Robust applications of educational data mining and learning analytics techniques come with costs and challenges. Information technology (IT) departments will understand the costs associated with collecting and storing logged data, while algorithm developers will recognize the computational costs these techniques still require. Another technical challenge is that educational data systems are not interoperable, so bringing together administrative data and classroom-level data remains difficult. Yet combining these data can give algorithms better predictive power. Combining data about student performance (online tracking, standardized tests, teacher-generated tests) to form one simplified picture of what a student knows can be difficult and must meet acceptable standards for validity. It also requires careful attention to student and teacher privacy and the ethical obligations associated with knowing and acting on student data.
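To make the interoperability problem concrete, the sketch below links classroom-level events to administrative records that use a different identifier, through a crosswalk table. Everything here is hypothetical: the field names, the ID formats, and the `link_records` helper are invented for illustration and do not describe any particular student information system.

```python
# Hypothetical administrative records keyed by a district-assigned ID.
admin_records = {
    "D-1001": {"grade_level": 7, "school": "North"},
    "D-1002": {"grade_level": 8, "school": "South"},
}

# Hypothetical classroom-level events keyed by a learning-system username.
lms_events = [
    {"lms_user": "u_ana", "item": "fractions-3", "correct": True},
    {"lms_user": "u_ben", "item": "fractions-3", "correct": False},
    {"lms_user": "u_guest", "item": "fractions-1", "correct": True},
]

# Crosswalk from learning-system usernames to district IDs (assumed to exist).
crosswalk = {"u_ana": "D-1001", "u_ben": "D-1002"}


def link_records(events, crosswalk, admin):
    """Attach administrative fields to each event; collect events with no match."""
    linked, unmatched = [], []
    for event in events:
        district_id = crosswalk.get(event["lms_user"])
        record = admin.get(district_id)
        if record is None:
            unmatched.append(event)  # no crosswalk entry or no admin record
        else:
            linked.append({**event, "district_id": district_id, **record})
    return linked, unmatched


linked, unmatched = link_records(lms_events, crosswalk, admin_records)
print(len(linked), "linked;", len(unmatched), "unmatched")
```

Keeping the unmatched events visible, rather than dropping them silently, matters for validity: a model trained only on linkable students may not represent everyone. Any such linkage also falls under the privacy obligations noted above.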
Educational data mining and learning analytics have the potential to make visible data that have heretofore gone unseen, unnoticed, and therefore unactionable. To help further the fields and gain value from their practical applications, the recommendations are that educators and administrators:
• Develop a culture of using data for making instructional decisions.
• Involve IT departments in planning for data collection and use.
• Be smart data consumers who ask critical questions about commercial offerings and create demand for the most useful features and uses.
• Start with focused areas where data will help, show success, and then expand to new areas.
• Communicate with students and parents about where data come from and how the data are used.
• Help align state policies with technical requirements for online learning systems.
Researchers and software developers are encouraged to:
• Conduct research on usability and effectiveness of data displays.
• Help instructors be more effective in the classroom with more real-time and data-based decision support tools, including recommendation services.
• Continue to research methods for using identified student information where it will help most, anonymizing data when required, and understanding how to align data across different systems.
• Understand how to repurpose predictive models developed in one context to another.
A final recommendation is to create and continue strong collaboration across research, commercial, and educational sectors. Commercial companies operate on fast development cycles and can produce data useful for research. Districts and schools want properly vetted learning environments. Effective partnerships can help these organizations codesign the best tools.
The summary above is from an issue brief developed under the guidance of Karen Cator and Bernadette Adams of the U.S. Department of Education, Office of Educational Technology:
U.S. Department of Education, Office of Educational Technology, Enhancing Teaching and Learning Through Educational Data Mining and Learning Analytics: An Issue Brief, Washington, D.C., 2012.
This report is available on the Department’s Web site at http://www.ed.gov/technology and is in the public domain. Its content covers the following topics:
- Personalized Learning Scenarios
- Data Mining and Analytics: The Research Base
- Educational Data Mining
- Learning Analytics
- Visual Data Analytics
- Data Use in Adaptive Learning Systems
- Educational Data Mining and Learning Analytics Applications
- User Knowledge Modeling
- User Behavior Modeling
- User Experience Modeling
- User Profiling
- Domain Modeling
- Learning System Components and Instructional Principle Analysis
- Trend Analysis
- Adaptation and Personalization
- Implementation Challenges and Considerations
- Technical Challenges
- Limitations in Institutional Capacity
- Privacy and Ethics Issues
- Researchers and Developers
- Collaborations Across Sectors