Lecture (2V+1Ü, 4 ECTS-LP) "Information Retrieval and Data Mining" (Module Description), Course Number INF-24-52-V-7
- Level: Master
- Language: English
Time and Location
- KIS entry
- Monday, 11:45-13:15
- Room 42-110
- Begin: 15.04.2019
- KIS entry
- Wednesday, 13:45-15:15 (biweekly, alternating with DDM)
- Room 46-110
- 15.01.2019: Website is online.
Please read carefully.
Students must participate successfully in the exercise sessions, according to the regulations below, to be admitted to the final exam.
- There will be 6 exercise sheets.
- The teaching assistant presents the solutions and answers questions.
- Attendance at the exercise sessions is not mandatory; still, we would be happy to see lively participation.
- Each sheet consists of 3 assignments, which makes 18 assignments in total. Each assignment is worth one point.
- A student needs to reach a total of at least 13 points throughout the semester to qualify for the final exam.
- Solutions to exercise sheets have to be submitted in OLAT.
- Students can work alone or in groups of at most two. Groups are fixed with the first submission. Each group member individually uploads the same solution in OLAT, with the names of both members on all sheets.
- Students need to mark in OLAT which individual assignments they have managed to solve correctly.
- Students may mark an assignment as solved only if they have fully completed it in a reasonable manner. In other words, the submitted solution has to present an in-depth approach for every part of the assignment, although it does not have to include every detail of a correct solution.
- In addition, the point for an assignment is awarded only if at least half (½) of its solution is correct.
- If a mark has obviously been placed in a dishonest attempt to obtain a point without proper engagement with the assignment, for instance if the marked assignment or parts of it are not done at all, the entire sheet is assessed with zero points.
- Copying solutions from other groups or from previously published solution sheets, if clearly identifiable, leads to the immediate disqualification of all involved groups from the course, regardless of the number of points they have legitimately earned.
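The point rules above can be illustrated with a small Python sketch. It is only an informal reading of the regulations, not an official grading tool: the names `Assignment`, `sheet_points`, and `admitted`, as well as the `correct_fraction` and `attempted` fields, are hypothetical and chosen for illustration. Disqualification for copied solutions is not modeled.

```python
from dataclasses import dataclass

SHEETS = 6                 # number of exercise sheets
ASSIGNMENTS_PER_SHEET = 3  # assignments per sheet, one point each
REQUIRED_POINTS = 13       # minimum total needed for exam admission


@dataclass
class Assignment:
    marked: bool             # marked as solved in OLAT
    correct_fraction: float  # assumed grader estimate of correctness, in [0, 1]
    attempted: bool = True   # False if a marked assignment (or part of it) is missing


def sheet_points(assignments):
    """Points for one sheet under the rules as read here."""
    # A mark on work that was not done sets the entire sheet to zero points.
    if any(a.marked and not a.attempted for a in assignments):
        return 0
    # Otherwise: one point per marked assignment that is at least half correct.
    return sum(1 for a in assignments if a.marked and a.correct_fraction >= 0.5)


def admitted(sheets):
    """True if the total over all sheets reaches the admission threshold."""
    return sum(sheet_points(s) for s in sheets) >= REQUIRED_POINTS


full = [Assignment(True, 1.0)] * ASSIGNMENTS_PER_SHEET
partial = [Assignment(True, 0.5), Assignment(True, 0.4), Assignment(False, 1.0)]
dishonest = [Assignment(True, 1.0), Assignment(True, 0.0, attempted=False),
             Assignment(True, 1.0)]

# Four full sheets (12 points), one partial sheet (1 point), one zeroed sheet.
print(admitted([full] * 4 + [partial, dishonest]))
```

In this example the total is 12 + 1 + 0 = 13 points, which just meets the threshold. Note that an unmarked assignment earns no point even if it is correct, since only marked assignments are counted.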
- Boolean Information Retrieval (IR), TF-IDF, IR evaluation
- Probabilistic IR, BM25
- Hypothesis testing
- Statistical language models, latent topic models
- Relevance feedback, novelty & diversity
- PageRank, HITS
- Spam detection, social networks
- Inverted lists
- Index compression, top-k query processing
- Frequent itemsets & association rules
- Hierarchical, density-based, and co-clustering
- Decision trees and Naive Bayes
- Support vector machines