The world is a dataset: with the right mindset, one can quickly grasp that data is everywhere. In particular, many new and fashionable topics (blockchain, NFTs, “generative AI” – sigh) are built on data and computer code.
Few people, however, have any idea how *code* works, or what the true promises (if any) of the advances in these fields are.
The course aims to plug that gap through an introduction to data analytics in Python. While the course focuses on legal data, the skills learned on legal datasets are transferable to any field. Mastering these skills is critical for your future career in the digital economy: beyond the hype about data and AI, knowing how to code is the best step towards implementing automation methods that will, simply, make your life easier. You will also develop the ability to talk (to some extent) with engineers, something that will prove useful later in your career.
The course should be of particular interest to students who want to work in tech start-ups (especially in legal tech), law firms, academia, or public policy-making, or who are simply curious to understand what an algorithm is, exactly. Be ready to do a lot of loops over dubious datasets.
The course is designed as a gradual introduction to Python and relevant methods of legal data analysis, within a 24-hour schedule. The first 12 to 16 hours are meant to be a sufficient introduction for the purpose of the Final Presentation (see below); the remaining hours are reserved for more specialised methods and uses. In particular, we will spend some time studying Large Language Models (LLMs; think ChatGPT, Claude, or Bard) and discussing how to harness them.
Each class will be built around three elements:
Some opening considerations of a theoretical nature;
Practical teaching, based on Python scripts that students will edit and run themselves; and
A number of exercises based on the material covered.
A full syllabus will be sent to the students signing up.
Given the intensive schedule, the course can accommodate only a limited number of students.
Damien CHARLOTIN
Seminar
French, English
This course will introduce students to basic and common data analysis methods and tools, as applied in particular to legal data and legal knowledge. This is, however, a course for beginners: there are no prerequisites, as we will start from the ground up and cover the basics extensively. Students will only need a computer and an internet connection. If you already know how to code in Python, you may find a large part of the course tedious.
Autumn 2024-2025
A midterm exam (50% of the final grade), due in week 8.
A final presentation (50% of the final grade), to be given in week 12.
You will carry out a complete analysis of legal data from the database, including the creation of a dataset (of legal data) and its cleaning.
Paul Ford, 'What is Code?' (Businessweek, 11 June 2015).
AT&T Archives: The UNIX Operating System (1982), video available on YouTube (viewed 23 August 2022).
Damien Charlotin & Wolfgang Alschner, 'Data Mining, Text Analytics, and Investor-State Arbitration' in Pietro Ortolani et al. (eds.), International Arbitration and Technology (Wolters Kluwer 2022).