Billy Jang is a computer science and mathematics double major writing a computer science thesis on security in machine learning systems. His advisor is Assistant Professor of Computer Science Scott Alfeld.
Q: Could you tell me about your thesis? A: At a high level, my thesis looks at security vulnerabilities in a field called machine learning, which is pretty closely related to statistics. My thesis work focuses on whether a hypothetical attacker might be able to reverse engineer specific aspects of a dataset if they have the ability to observe how a given model updates when it receives new data. This is a great project for me since the work ends up being a decent split between math and computer science, which are my two majors. It’s also a great primer for graduate school because I want to eventually pursue a Ph.D. in machine learning.
Q: What do you think is relevant about your thesis for general audiences? A: Considering how much data we put online now, I think these sorts of attacks are always something to worry about. That being said, my work does show that for specific situations there is a limit on what the attacker can learn about the original dataset. This might make us look at other questions related to data privacy, such as “Is a given person in my dataset?” rather than “What is my entire dataset?”
Q: What are some real-world applications of your research? A: I think the canonical example that people think about would be a spam filter. Our data refers to an initial inbox for a person, and the spam filter changes a little bit every time more emails come in. If an attacker is able to observe how that spam filter changes over a couple of updates, in certain situations, the attacker would be able to know exactly how the spam filter would update in the future. One could also imagine higher-stakes situations where an attacker is able to observe a couple of updates.
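The spam filter scenario can be made concrete with a toy sketch. This is an illustrative assumption, not the method from the thesis: it uses a hypothetical one-feature ridge regression "filter" (the names `fit`, `update`, `LAMBDA` and all of the numbers are invented for the example) that is refit whenever a new labeled email arrives. An attacker who knows the new email and sees the weight before and after a single update can solve for the model's summary statistics and then predict later updates.

```python
# Toy sketch (assumption, not the thesis method): a one-feature ridge
# regression "filter" with weight w = b / a, where
#   a = LAMBDA + sum(x * x)  and  b = sum(x * y)  over the inbox.

def fit(a, b):
    """Weight of the 1-D ridge model given its summary statistics."""
    return b / a

def update(a, b, x, y):
    """Fold one new example (x, y) into the summary statistics."""
    return a + x * x, b + x * y

# Hidden from the attacker: the original inbox (made-up numbers).
LAMBDA = 1.0
inbox = [(1.0, 1.0), (2.0, 0.0), (0.5, 1.0), (3.0, 0.0)]
a0 = LAMBDA + sum(x * x for x, _ in inbox)
b0 = sum(x * y for x, y in inbox)

# The attacker sees the weight before and after one known new email.
w_before = fit(a0, b0)
x1, y1 = 1.5, 1.0
a1, b1 = update(a0, b0, x1, y1)
w_after = fit(a1, b1)

# Since (a0 + x1^2) * w_after = a0 * w_before + x1 * y1,
# the attacker can solve for a0, and then b0 = a0 * w_before.
a0_guess = x1 * (y1 - x1 * w_after) / (w_after - w_before)
b0_guess = a0_guess * w_before

# With those two numbers the attacker can predict the next update.
x2, y2 = 0.8, 0.0
a1_guess, b1_guess = update(a0_guess, b0_guess, x1, y1)
predicted = fit(*update(a1_guess, b1_guess, x2, y2))
actual = fit(*update(a1, b1, x2, y2))
```

In this toy model the attacker recovers only the two summary numbers `a0` and `b0`. Many different inboxes produce the same pair, so the individual emails stay hidden, which is one simple way to picture a limit on what an attacker can learn.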
Q: What sparked your thesis idea? A: The idea mostly came from my advisor, Professor Alfeld. We were talking about a related topic called machine teaching, and I thought it would be interesting to extend some of the ideas in machine teaching to this setting in which an attacker gets to observe multiple changes to the model as they come in. Professor Alfeld took that idea and helped me make it into my current project.
Q: What kinds of research have you done for your thesis so far? A: On a given day, I’m either reading papers about some related topic, doing some math to try and figure out a problem, or writing code to run an experiment so I can validate the theoretical results that I find. It’s definitely very time-consuming, but I’m trying to strike a balance considering it is senior year. My advice would be to do a little bit of work each day. Ultimately, senior year is something that I think most of us want to try and enjoy, especially after a grueling three years of work.
Q: What is the most rewarding part about the process of writing a thesis? A: I’ve been spending a lot of time learning new things in math and computer science, such as a new programming language or more linear algebra, so that I can tackle specific problems for my thesis. Being able to use those new skills to arrive at some novel solution to a problem is really satisfying. In my situation, it didn’t quite work out that way because I ended up not really finding a solution, but I think the main idea still stands.
Q: What is the hardest part of writing a thesis? A: Probably all of the mistakes. I’ve spent hours in the lab working on some problem only to realize that it came down to some tiny error I had overlooked, or that I had made some assumptions about the mathematics that are not necessarily true. For example, the latter situation led to me submitting work to a workshop when that work was actually completely wrong. That was definitely stressful, but overall the entire process is still really rewarding.
Q: Do you have any advice for students who plan on writing a thesis? A: I would recommend that they find a way to strike a balance between thesis-ing and enjoying senior year. Easier said than done, but the entire process is still 100 percent worth it.