M O N A T A R

InfoTech Unit Avatar

FIT5212 Data analysis for semi-structured data

Chief Examiner

Wray Buntine

Unit Code, Name, Abbreviation

FIT5212 Data analysis for semi-structured data (14 Aug 2017, 10:05am) [Analysis semi-struct data (14 Aug 2017, 10:06am)]

Reasons for Introduction

Reasons for Introduction (14 Aug 2017, 10:04am)

The topic of this unit is poorly addressed at Monash. It is an area of growing demand in the Data Science community, to the point that some students have raised it in informal discussions. It is also an area where Monash faculty distinguish themselves internationally, so we can draw on quality material and guidance in preparing the unit. Finally, the Master of Data Science (MDS) is clearly in need of additional data analysis units.

Reasons for Change (03 Sep 2021, 10:21am)

24/9/2019: Admin - added 30 minutes of reading time to the overall exam duration as per University requirements.

3/2/2020: Admin - amended the exam reading and noting time, reducing it from 30 minutes to 10 minutes (2 hours 10 minutes total) to meet University standards.

18/9/2020: Admin - updated to include new assessment and teaching approach fields as per Handbook requirements.

13/11/2020: Wray Buntine - changed laboratory to tutorial to reflect the mode of class.

Updated: added assessments for the online offering.

Role, Relationship and Relevance of Unit (14 Aug 2017, 10:12am)

This unit will be offered as an elective in the C6004 Master of Data Science.

Objectives

Objectives (14 Aug 2017, 10:07am)

At the completion of this unit, students should be able to:

  1. appraise what kinds of semi-structured data exist and the problems they present for analysis;
  2. analyse different kinds of algorithms for different kinds of semi-structured data;
  3. develop and modify some standard algorithms for semi-structured data;
  4. examine some characteristic industry problems involving semi-structured data, and analyse the suitability of different algorithms.

Unit Content

ASCED Discipline Group Classification (14 Aug 2017, 10:07am)

020119

Synopsis (14 Aug 2017, 10:07am)

Semi-structured data is one of the fastest growing kinds of data in both the public and private sectors, for instance in health. Email collections, which combine sender-recipient graphs, metadata and text content, are one example. This unit will explore basic forms of semi-structured data: text, time-sequence data, graphs and multiple relations in a database. Basic machine learning algorithms for these kinds of data will be analysed and applied. Some characteristic industry problems involving semi-structured data, such as cohort analysis and market-basket analysis, will also be investigated.
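
As an illustration only, the email example in the synopsis can be sketched in Python using nothing beyond the standard library. The records, field names (sender, recipients, sent, body) and variable names below are invented for this sketch and are not taken from the unit materials; the sketch simply shows how one collection yields a graph view, a text view and a time-sequence view.

```python
from collections import Counter

# Hypothetical toy email collection; every record and field name is invented.
emails = [
    {"sender": "ana", "recipients": ["ben", "cy"], "sent": "2018-03-01",
     "body": "Budget draft attached"},
    {"sender": "ben", "recipients": ["ana"], "sent": "2018-03-02",
     "body": "Budget looks fine"},
    {"sender": "ana", "recipients": ["ben"], "sent": "2018-03-05",
     "body": "Meeting moved to Friday"},
]

# Graph view: number of messages on each directed sender -> recipient edge.
edges = Counter((m["sender"], r) for m in emails for r in m["recipients"])

# Text view: bag-of-words counts over the lower-cased message bodies.
words = Counter(w.lower() for m in emails for w in m["body"].split())

# Time-sequence view: number of messages per day, taken from the metadata.
per_day = Counter(m["sent"] for m in emails)

print(edges.most_common(1))   # most frequent sender -> recipient edge
print(words.most_common(1))   # most frequent word
print(sorted(per_day.items()))
```

Each view suits a different family of algorithms (graph analysis, text analysis, sequence analysis), which is why such collections are described as semi-structured rather than as a single table.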

Prescribed Reading (for new units) (13 Nov 2020, 4:36pm)

Technological requirements

Students will use Python and Jupyter Notebook for assignments and tutorials, and are advised to bring their own laptops for these.

Teaching Methods

Mode (14 Aug 2017, 10:14am)

On-campus

Assessment

Assessment Summary (03 Sep 2021, 10:19am)

Examination (2 hours and 10 minutes): 50%; in-semester assessment: 50%.

  1. Assessment 1: Text Analysis - 25% - ULO: 1, 2, 3
  2. Assessment 2: Graph data analysis - 25% - ULO: 2, 3, 4
  3. Examination - 50% - ULO: 1, 2, 3, 4

Online offering:

  • Assignment 1 - 30% - ULO: 1, 2, 3
  • Assignment 2 - 30% - ULO: 2, 3, 4
  • Scheduled Final Quiz - 40% - ULO: 1, 2, 3, 4

Workloads

Workload Requirements (13 Nov 2020, 4:35pm)

Minimum total expected workload equals 12 hours per week comprising:

  • Two hours/week lectures
  • Two hours/week tutorials
  • A minimum of 8 hours per week of personal study (22 hours per week for Monash Online students) for completing lab/tutorial activities, assignments, private study and revision, and for online students, participating in discussions.

Resource Requirements

Teaching Responsibility (Callista Entry) (14 Aug 2017, 10:10am)

FIT

Prerequisites

Prerequisite Units (14 Aug 2017, 10:10am)

FIT5197

Proposed year of Introduction (for new units) (14 Aug 2017, 10:11am)

Semester 1, 2018

Location of Offering (14 Aug 2017, 10:11am)

Caulfield

Faculty Information

Proposer

Jeanette Niehus

Approvals

School: 03 Sep 2021 (Monica Fairley)
Faculty Education Committee: 03 Sep 2021 (Monica Fairley)
Faculty Board: 03 Sep 2021 (Monica Fairley)
ADT:
Faculty Manager:
Dean's Advisory Council:
Other:

Version History

14 Aug 2017 Jeanette Niehus Admin: new unit approved at FEC 3/17.
14 Aug 2017 Jeanette Niehus FIT5212 Chief Examiner Approval, ( proxy school approval )
14 Aug 2017 Jeanette Niehus FEC Approval
14 Aug 2017 Jeanette Niehus FacultyBoard Approval - Approved at FEC 3/17 (Item 8.1) 13 July 2017
24 Sep 2019 Emma Nash modified ReasonsForIntroduction/RChange; modified Assessment/Summary; modified ReasonsForIntroduction/RChange
03 Feb 2020 Emma Nash modified ReasonsForIntroduction/RChange; modified Assessment/Summary
18 Sep 2020 Joshua Daniel modified ReasonsForIntroduction/RChange; modified UnitContent/PrescribedReading; modified Assessment/Summary
13 Nov 2020 Wray Buntine modified Workload/ContactHours; modified UnitContent/PrescribedReading; modified ReasonsForIntroduction/RChange
21 Dec 2020 Jeanette Niehus FIT5212 Chief Examiner Approval, ( proxy school approval )
21 Dec 2020 Jeanette Niehus FEC Approval
21 Dec 2020 Jeanette Niehus FacultyBoard Approval - Approved by FEC via email 17/12/2020
03 Sep 2021 Monica Fairley modified Assessment/Summary; modified ReasonsForIntroduction/RChange
03 Sep 2021 Monica Fairley FIT5212 Chief Examiner Approval, ( proxy school approval )
03 Sep 2021 Monica Fairley FEC Approval
03 Sep 2021 Monica Fairley FacultyBoard Approval - executively approved 3/9/21
