Clinical Informatics and Data Science Pathway

The Clinical Informatics and Data Science (CI-DS) Pathway is an introduction to the fields of Clinical Informatics and Data Science. It gives learners an opportunity to learn basic concepts, principles, and skills in these fields, and serves as a resource to help them consider and pursue careers in these areas.




Sessions led by local and international experts

GME and UME participants in 2022-23

The weeklong seminar will be held September 16-20, 2024, at Parnassus.

Clinical Informatics and Data Science Objectives
  • Learn the key vocabulary, principles, and history of Clinical Informatics
  • Understand the breadth of clinical informatics
  • Explore the basic literature of the clinical informatics field
  • Learn basic data science vocabulary and principles
  • Develop basic data science skills
  • Understand possible careers in informatics
Curriculum Overview

The one-week seminar is open to all GME trainees and fourth-year medical students. We anticipate that it will be held in mid to late September of each academic year. The course will be a mixture of lectures, workshops, and panels.

Learning will be done through a mix of didactics, self-paced exercises, regular group sessions, and participation in real-life informatics/data science experiences. At the end of the course, learners should come away with a basic understanding of the fields of Clinical Informatics and Data Science and have practiced some basic data science skills. The course is also a great way to meet and interact with local informatics faculty, who can potentially provide additional long-term mentorship and project work.

Sessions will focus on different areas of CI-DS, and the planned didactics are as follows:

  • Two days will be focused on Clinical Informatics
  • One day will focus on Data Science 
  • One day on artificial intelligence/machine learning
  • One day on career opportunities in informatics

The longitudinal curriculum has limited seating, and learners are expected to complete 20-40 hours of coursework over the year. Some required activities can be done in your free time. In addition, learners must earn a certain number of “points” to obtain the certificate. Points can be earned through a variety of activities of the learner's choosing (e.g., attending local, regional, or national informatics webinars and lectures, monthly check-ins, or UCSF informatics meetings; taking data science courses; going to conferences; interviewing faculty members).

Some highlights of the longitudinal component:

  • Obtain access to the UCSF de-identified data warehouse
  • Do standard queries in the database to learn SQL and important data science concepts
  • Obtain access to the Information Commons cloud computing platform
  • Follow instructions to run a standardized machine learning model in the Information Commons
  • Learners are required to start with a question they want to explore or answer using data, and should at least attempt to answer it after completing the structured SQL exercises. Completing an academic product (e.g., an abstract or paper) is encouraged but not required.
  • We will try to connect highly motivated learners with faculty mentors for additional work or projects, depending on interest and fit.
Pathway Highlights
  • Mentorship, career guidance, and exposure to additional training opportunities such as the Clinical Informatics Fellowship
  • Develop career skills that translate to numerous projects
  • Learn how to pull your own de-identified data and how to request changes to the EHR
Frequently Asked Questions

Is there a cost to participate in the CI-DS Pathway?

  • No, the GME Pathways Program, which includes the CI-DS Pathway, is offered free to our UCSF trainees and students.

Do I need to have a project and mentor identified before the course?

  • No. Part of the longitudinal component of this course will introduce you to mentors and the breadth of opportunities that exist within informatics and data analytics. However, we do recommend that longitudinal pathway participants have a data-centric question in mind as they take part in the pathway. We hope that as learners develop the skills to pull their own data, they will use these skills to answer questions they are curious about. In our opinion, the best way to learn how to code is by solving a problem that interests the learner.

Who should take the seminar week only?

  • Learners who are, or might be, interested in pursuing a career in informatics should strongly consider taking part in both the seminar week and the longitudinal pathway. Learners who want to develop some understanding and skills in clinical informatics and data science, but do not have the interest or time to dive deeply into these fields, might consider attending only the seminar week. Participants who are unsure can start with the seminar week only and opt into the longitudinal pathway afterward.

Who should participate in the longitudinal pathway?

  • Learners may opt to participate in the longitudinal pathway if they wish to remain engaged and gain in-depth exposure to a variety of informatics and data science concepts and experiences throughout the rest of the academic year (and beyond). The Pathway will expose learners to a range of additional experiences, including lectures, hands-on practical projects and cases, machine learning concepts, informatics governance, and more. Those who are interested in a career in informatics should strongly consider participating in the longitudinal pathway.

Can I get protected time to work on the longitudinal components of the clinical informatics and data science pathway?

  • This is determined on a case-by-case basis. The learner's department or division may consider providing dedicated time during the year, but that time would come with strict expectations for academic output.