Data is an essential component of the daily operations of businesses. This includes data on customers’ needs, which is usually collected to improve processes, products, or services. Big data comes in large volumes, from different sources, with different structures and types, and at high speeds. It is too large to be realistically analyzed by decision-makers on their own. This has pushed decision-makers to rely increasingly on statistical analysis.
Big data analytics is also very important in higher education. It facilitates decision-making and the improvement of individual and class performance in a competitive environment. Big data in education comes from two main sources: student information systems (covering academic backgrounds, enrollment status, student performance, and demographics) and learning management systems, or LMSs (such as Blackboard, Canvas, and Moodle, which provide information on students’ behavior).
The 6 Vs of big data analytics
The 6 Vs of big data analytics are six main concepts that define big data and the process of analyzing it. They are:
- Volume: the amount of data (e.g., card transactions, number of students using LMS)
- Velocity: the speed at which information flows (e.g., financial information, lectures and exercises attended, notes delivered to and accessed by students, regular student consultations, number of attending students, exam scores, and feedback for students and lecturers on their performance)
- Veracity: the accuracy and trustworthiness of data, the storing process, and the relevance of data to the purpose for which it was collected. Veracity also covers questions of trust and uncertainty.
- Variety: the division of data into structured, semi-structured, and unstructured forms
- Verification: data verification and security
- Value: determining the necessary activities and acting fast on any issues by students, lecturers, and other individuals or processes to generate value and benefits
The 6 Vs of big data analytics can be achieved through three stable pillars, which are as follows.
The three pillars of big data
The three pillars of big data are:
- Data collection or access: identifying the valuable, accurate, and relevant information in the big data collected, filtering it, and structuring it by importance, type, etc.
- Data analysis: analyzing the collected data, first by determining the correlation and regression of variables and then by carrying out deeper analysis, depending on the complexity of the big data
- Visualization and application: presenting the analyzed data in an accessible form and making it available to users and decision-makers
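The three pillars above can be sketched in a few lines of Python. This is only a minimal illustration, not a prescribed method: the records, the field names (`lms_hours`, `exam_score`), and the rule for filtering incomplete records are all hypothetical. The sketch filters the data (collection), computes a Pearson correlation and a least-squares regression line (analysis), and produces a one-line summary for decision-makers (visualization and application).

```python
from math import sqrt

# Hypothetical records: weekly LMS hours and exam score per student.
records = [
    {"student": "s1", "lms_hours": 2.0, "exam_score": 55.0},
    {"student": "s2", "lms_hours": 4.0, "exam_score": 65.0},
    {"student": "s3", "lms_hours": 6.0, "exam_score": 70.0},
    {"student": "s4", "lms_hours": 8.0, "exam_score": 85.0},
    {"student": "s5", "lms_hours": None, "exam_score": 60.0},  # incomplete
]


def collect(rows):
    """Pillar 1: keep only records with both values present (assumed rule)."""
    return [
        r for r in rows
        if r["lms_hours"] is not None and r["exam_score"] is not None
    ]


def analyze(rows):
    """Pillar 2: Pearson correlation and simple least-squares regression."""
    xs = [r["lms_hours"] for r in rows]
    ys = [r["exam_score"] for r in rows]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "r": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


def summarize(result):
    """Pillar 3: turn the numbers into a sentence a decision-maker can read."""
    return (
        f"correlation r = {result['r']:.2f}; each extra LMS hour is "
        f"associated with {result['slope']:.2f} more exam points"
    )


clean = collect(records)
result = analyze(clean)
print(summarize(result))
```

A correlation of this kind shows association only. It does not show that more LMS time causes higher scores, which is one reason the deeper analysis mentioned in the second pillar is needed.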
Big data in higher education institutions: types and purpose
The most common types of big data stored by higher education institutions include administrative data, department data, curriculum data, teaching and learning data, research data, and student data. Such data are usually collected in order to evaluate the future performance and challenges across academics, research, teaching, learning, outcomes, and growth of institutions.
There are three main categories of big data characteristic of higher education institutions:
- Administrator: an institution’s academic programming and the allocation of financial and human resources, taking into account capacity building and support for ongoing efforts
- Students: proactive feedback on student performance in class, exercises, and participation, as well as learning pathways and the planning of learning activities
- Lecturers: the lectures, continuous improvement of teaching, and instant feedback to students and administrators
The three stages of online big data analytics
Online big data goes through three stages:
- The micro stage treats the clickstream data generated by students (the interaction of an individual with the learning environment, including simulations, intelligent tutoring systems, previews of courses, videos, games, etc.). This stage provides real-time insight into students’ self-regulated learning. This is done by clustering all the information and generating personal student profiles.
- The mid-stage, or text data stage, treats students’ digital writing from discussion forums, online assignments, and social media interactions. This stage uses linguistic tools to assess student performance and student writing through cognitive (automatic student feedback and automated grading), social (online dialogue and discussion patterns, transcripts, videos, etc.), behavioral (course engagement, resource seeking), and affective (students’ self-concept or sentiment, motivation, and engagement in learning activities, e.g., their opinions on the course) methods.
- The macro stage, or institutional data stage, treats students’ demographics (admission data, class schedules, enrollments, term grades). This stage provides insight into students on a semester or yearly basis. It also supports early warnings (signs that students may be at risk of dropping a course or a program), course guidance, and an information system that gives administrators information about the environment and helps them anticipate students’ future behavior regarding the completion of their degree, their eligibility to graduate, or the need for intervention.
Nevertheless, there can also be overlaps between the three stages, as the data collected covers the entire institution, not just students.
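As a rough illustration of how the micro and macro stages can feed each other, the sketch below aggregates clickstream events into per-student profiles and then raises a simple early warning. The event format, the student identifiers, and the thresholds (`MIN_EVENTS`, `MIN_GRADE`) are hypothetical and chosen only for the example. A real system would more likely use a clustering algorithm or a predictive model than fixed cut-offs.

```python
from collections import Counter, defaultdict

# Hypothetical clickstream events (micro stage): (student, activity type).
events = [
    ("s1", "video"), ("s1", "quiz"), ("s1", "forum"), ("s1", "video"),
    ("s2", "video"),
    ("s3", "quiz"), ("s3", "quiz"), ("s3", "video"),
]

# Hypothetical institutional data (macro stage): current term grade per student.
term_grades = {"s1": 78.0, "s2": 48.0, "s3": 52.0}

MIN_EVENTS = 3    # assumed minimum number of events for "active" students
MIN_GRADE = 55.0  # assumed grade below which a student may need support


def build_profiles(clicks):
    """Count each student's events by activity type."""
    profiles = defaultdict(Counter)
    for student, activity in clicks:
        profiles[student][activity] += 1
    return profiles


def at_risk(profiles, grades):
    """Flag students who show both low activity and a low term grade."""
    flagged = []
    for student, grade in grades.items():
        total_events = sum(profiles[student].values())
        if total_events < MIN_EVENTS and grade < MIN_GRADE:
            flagged.append(student)
    return sorted(flagged)


profiles = build_profiles(events)
warnings = at_risk(profiles, term_grades)
print(dict(profiles["s1"]), warnings)
```

Requiring both signals keeps the warning list short in this toy example. Whether low activity alone should trigger an intervention is a policy choice for the institution.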
The challenges of online big data and analytics in higher education institutions can be approached through correlation and regression, as well as other statistical and analytical tools that provide decision-makers with accurate and useful information.
Big data analytics therefore offers numerous benefits that can help resolve critical issues in the educational system.