Hello! My name is Adrian, and I am a Computer Scientist who likes traveling, mountain hiking, and the wilderness in general. I am currently living in Zürich (Switzerland), and I have previously lived and worked in Romania, California, New York, and the UK (repeatedly in each).
Like many people in my field, I have been interested in Computer Science ever since elementary school. After completing my Bachelor's in 2012, I pursued a Master's and later a PhD degree at the University of Cambridge, UK, where I worked on Citation Recommendation and read about Natural Language Processing and Machine Learning until 2016.
Between 2012 and 2017 I was also an editor (and later Departments Chief) of ACM XRDS Magazine in New York City, and nowadays I serve on the Assessment & Search Committee of ACM's Publications Board.
In January 2016, I joined Google Switzerland to design and deploy a Machine Learning-based solution for identifying and taking down mass-created and/or mass-coordinated fake or abusive Google accounts.
In my spare time, I enjoy reading and mountaineering (see my YouTube Channel).
The Wanish Project
In October 2021, inspired partly by the contrast between how I experienced the city of Zürich before, during, and in the closing stages of the COVID-19 pandemic, I started a personal project for sanitizing videos by allowing the user to remove moving objects (including people and moving cars) from them. The working name I gave to this project is Wanish (inspired by the popular household detergent product), and I tested the alpha launch on scenes from around Zürich.
I am attaching concrete examples of outputs from the project (taken at night in the first instance, in order to guarantee stable sky lighting conditions).
|Sample including a large vehicle passing through the video.|
|Sample including people and a large vehicle passing through the video.|
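One common way to remove moving objects from a fixed-camera clip is a per-pixel temporal median, and the sketch below illustrates only that idea. It is not necessarily how Wanish works: the function names and the synthetic clip are assumptions made for illustration, and a real video would need frame decoding, stabilization, and handling of lighting changes.

```python
import numpy as np


def median_background(frames: np.ndarray) -> np.ndarray:
    """Estimate the static background of a clip.

    `frames` is a stack shaped (T, H, W) or (T, H, W, C). A pixel that is
    covered by a moving object in fewer than half of the frames keeps its
    background value in the per-pixel median over time.
    """
    if frames.ndim < 3:
        raise ValueError("expected a stack of frames with a leading time axis")
    return np.median(frames, axis=0).astype(frames.dtype)


def make_synthetic_clip(num_frames: int = 9, height: int = 8, width: int = 16):
    """Hypothetical test clip: a static gradient with a bright block moving right."""
    background = np.tile(np.arange(width, dtype=np.uint8) * 10, (height, 1))
    frames = np.repeat(background[np.newaxis], num_frames, axis=0)
    for t in range(num_frames):
        frames[t, 2:4, t:t + 2] = 255  # the moving "object"
    return background, frames


if __name__ == "__main__":
    background, frames = make_synthetic_clip()
    cleaned = median_background(frames)
    print("object removed:", np.array_equal(cleaned, background))
```

The median works here because each pixel is occluded in only a small fraction of the frames; objects that linger in one place for most of the clip would survive this kind of filter.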
Timelapse Videos from USB-Connected Webcam
In the second half of 2018 I wrote a Linux command-line application which connects to a USB-mounted webcam and uses it to produce a timelapse video at fixed focus distance, but with auto-adjusting settings for contrast, brightness, saturation, etc. The application also helps the user decide how often to sample frames from the webcam based on the desired timelapse speed and the available disk space.
I am attaching concrete examples of timelapse videos taken using my tool. The subjects of the videos (which range from 1 day to 1 month in coverage) are some of the plant species I grow in my spare time.
|Lemon trees I started from seed, growing indoors over a 24-hour period in September 2018.|
|Paperwhite narcissus I started from bulbs, sprouting, blooming, and wilting indoors over a 25-day period in December 2018.|
|Crocuses and hyacinths I started from bulbs, sprouting, and blooming over a 25-day period from mid-February to mid-March 2019.|
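The trade-off between timelapse speed and disk space mentioned above can be sketched as a small calculation. This is an illustrative assumption about how such a decision could be made, not the tool's actual logic; the function name, parameters, and example numbers are all hypothetical.

```python
def sampling_interval_s(
    duration_s: float,
    speedup: float,
    playback_fps: float,
    frame_bytes: int,
    free_bytes: int,
) -> float:
    """Return the number of seconds to wait between captured frames.

    Two constraints apply, and the larger (slower) interval wins:
    - speed: playing back at `playback_fps` must compress time by `speedup`,
      so one frame is needed every `speedup / playback_fps` seconds;
    - disk: at most `free_bytes // frame_bytes` frames fit on disk over the
      whole `duration_s` of recording.
    """
    interval_for_speed = speedup / playback_fps
    max_frames = free_bytes // frame_bytes
    if max_frames < 1:
        raise ValueError("not enough disk space for a single frame")
    interval_for_disk = duration_s / max_frames
    return max(interval_for_speed, interval_for_disk)


if __name__ == "__main__":
    one_day = 24 * 60 * 60
    # Compress 24 hours into 60 seconds of video at 30 fps (a 1440x speed-up),
    # assuming roughly 200 kB per stored frame.
    print(sampling_interval_s(one_day, 1440, 30, 200_000, 10**9))
    print(sampling_interval_s(one_day, 1440, 30, 200_000, 10**8))
```

When the disk constraint dominates, the resulting video is shorter (and faster) than requested, which is the kind of consequence a user would want to see before starting a month-long capture.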
Machine Learning for NLP Workshop @ ROSEdu Summer Workshops
Between June 15th and June 26th 2015, I designed, organized, and taught a summer school on the topic of Machine Learning for Natural Language Processing (ML4NLP) at the "Politehnica" University of Bucharest. The workshop comprised two stages: (A) a taught component which took place between June 15th and June 19th with two hours of taught lectures daily, followed by a one-hour hands-on practical session in which the participants solved a small problem in order to apply the notions learned, and (B) a weeklong hackathon which took place between June 23rd and June 26th, and during which a selected team of participants worked under my supervision to create a prototype of a post-OCR text regeneration tool.
In total, 13 applicants were selected to participate, with backgrounds ranging from High School level to University Lecturer. The feedback was overwhelmingly positive, with an overall average score of 4.6 out of 5 across the feedback forms.
The main page for the summer school (which includes links to all the presentation slides from the first week) can be publicly accessed here.
MPhil Research Dissertation
My MPhil Research Project focused on applying unsupervised and weakly-supervised Machine Learning to Natural Language Processing, under the supervision of Dr. Diarmuid Ó Séaghdha and Prof. Stephen Clark.
In my thesis, titled "Multilingual Generative Models for Selectional Preference Learning" (click thumbnail to download PDF), I investigated the use of Latent Dirichlet Allocation (LDA) for inducing plausibility estimates specific to the selectional preferences of Verb-Subject and Verb-DirectObject pairs in English (and verified state-of-the-art performance on three other European languages), and I tested the feasibility of Vector Space Alignment to transfer the estimates from resource-rich European languages like English (for which dependency parsers can be trained), to languages which do not benefit from a large body of research: German, Spanish, and Romanian.
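To give a flavor of what a topic-model plausibility estimate looks like, here is a minimal sketch of the usual LDA-style mixture, in which the plausibility of a noun as an argument of a verb is P(noun | verb, relation) = Σ_z P(z | verb, relation) · P(noun | z). The exact formulation in the dissertation may differ; the two topics, the vocabulary, and all probabilities below are made-up values for illustration, not results from the thesis.

```python
import numpy as np

# Hypothetical toy vocabulary of argument nouns.
nouns = ["water", "beer", "idea", "table"]

# P(noun | topic): one row per latent topic, each row sums to 1.
noun_given_topic = np.array([
    [0.45, 0.45, 0.05, 0.05],  # a "liquids" topic
    [0.05, 0.05, 0.60, 0.30],  # everything else
])

# P(topic | verb, relation): the topic mixture of each predicate slot.
topic_given_predicate = {
    ("drink", "dobj"): np.array([0.9, 0.1]),
}


def plausibility(verb: str, relation: str, noun: str) -> float:
    """Marginalize over topics: sum_z P(z | verb, relation) * P(noun | z)."""
    theta = topic_given_predicate[(verb, relation)]
    return float(theta @ noun_given_topic[:, nouns.index(noun)])


if __name__ == "__main__":
    for noun in nouns:
        print(f"drink {noun}: {plausibility('drink', 'dobj', noun):.3f}")
```

In a trained model both tables would be estimated from dependency-parsed text rather than written by hand; the point of the latent topics is that a noun never seen with a verb can still receive a sensible score.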
Project Training Data
As part of my research, I produced clean, dependency-parsed corpora from the non-listy German, Spanish, and Romanian Wikipedia articles in CONLL format, which I am releasing below. Please read the "README.txt" file in each archive for details about how the text was extracted and processed, as well as for information regarding licensing.
Test Dataset for the Romanian Language
For evaluation purposes, I also created the first test dataset for selectional preference estimation in the Romanian language, by eliciting responses online from native raters. You can access the datasets in either PNG or CSV format.
The methodology of compiling the dataset is detailed in the main body of the dissertation. If you wish to publish results based on this dataset, please contact me by email first.
The code developed as part of the project is hosted publicly on my BitBucket account.
|The main project code, mostly written in Java to handle probability tables.||(repository)|
|The code I wrote to strip away Textile markup and retrieve the plain text from the extracted Wikipedia articles.||(repository)|
Spam Detector Robustness Study
In 2013, I carried out a study (click thumbnail to download PDF) on the problem of email spam classification in order to verify the empirical claim that the performance of word-based classifiers as a function of the leading K tokens of an email saturates quickly with increasing values of K. In doing so, I built on the work of (Çıltık and Güngör, 2008), and tested Multinomial Naive Bayes (MNB), Support Vector Machine (SVM), Bayesian Logistic Regression (BLR), and Interpolated Language Models (ILM) approaches on the GenSpam corpus.
The study concluded that the MNB, SVM, and BLR methods are robust to decreasing text message length down to approximately 100 tokens, while ILM is robust down to approximately 20 tokens.
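The experimental idea of classifying from only the leading K tokens can be illustrated with a tiny Multinomial Naive Bayes classifier. This is a from-scratch sketch with Laplace smoothing and whitespace tokenization, not the code or settings used in the study; the four training messages are invented and have nothing to do with the GenSpam corpus.

```python
import math
from collections import Counter


def leading_tokens(text: str, k: int) -> list:
    """Keep only the first k whitespace-separated tokens of a message."""
    return text.lower().split()[:k]


class MultinomialNB:
    """Minimal Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = set().union(*(wc.keys() for wc in self.word_counts.values()))
        return self

    def predict(self, tokens):
        n_docs = sum(self.doc_counts.values())

        def log_posterior(c):
            total = sum(self.word_counts[c].values()) + self.alpha * len(self.vocab)
            score = math.log(self.doc_counts[c] / n_docs)
            for tok in tokens:
                if tok in self.vocab:  # ignore tokens never seen in training
                    score += math.log((self.word_counts[c][tok] + self.alpha) / total)
            return score

        return max(self.classes, key=log_posterior)


def train_on_prefix(messages, k):
    """Train a classifier that only ever sees the first k tokens."""
    docs = [leading_tokens(text, k) for text, _ in messages]
    labels = [label for _, label in messages]
    return MultinomialNB().fit(docs, labels)


# Invented toy data, for illustration only.
TRAIN = [
    ("win cash prize now click here", "spam"),
    ("cheap pills win money now", "spam"),
    ("meeting moved to monday morning", "ham"),
    ("please review the attached report", "ham"),
]

if __name__ == "__main__":
    k = 3
    model = train_on_prefix(TRAIN, k)
    print(model.predict(leading_tokens("win cash today and more", k)))
    print(model.predict(leading_tokens("please review the budget", k)))
```

Repeating the training and evaluation for a range of K values, and plotting accuracy against K, is the kind of saturation curve the study examines.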
Upon graduation, I wrote my Bachelor Thesis on the topic of automated plagiarism detection in specialized corpora (academic papers in the field of Computer Science), under the supervision of Dr. Traian Rebedea and Prof. Răzvan Rughiniș.
In my thesis, titled "The AuthentiCop System for Plagiarism Detection in Specialized Corpora: Algorithms and Data Processing" (click thumbnail to download PDF), I described building an automated pipeline which can perform plagiarism detection based on the Encoplot algorithm (Grozea and Popescu, 2011). Together with Filip Buruiană, I wrote a paper on the topic of the thesis, which received the best paper award at the Student Scientific Paper Session of the "Politehnica" University of Bucharest, 2012.
Teaching Assistant @ Politehnica University of Bucharest
During my undergraduate years, I also acted as a Teaching Assistant for a number of courses at the Politehnica University of Bucharest: Operating Systems Usage, Computer Programming, Data Structures, and Algorithm Design. As a teaching assistant, I taught during laboratory classes, wrote laboratory exercises, came up with homework assignment and final exam questions, wrote tutorials, and managed course repositories. I am publicly releasing some of the work below.
The Algorithm Design Course
|The source code for the official C++ solutions to the 12 coding laboratories is hosted publicly on my BitBucket account.||(repository)|
The repository also contains the source code I wrote for the automated grader and visualizer for the second assignment of the 2011-2012 academic year, which required students to write an engine that can beat our AI at the Connect4 board game. Two video tutorials explaining how to use the visualizer to test hand-written AI bots for Connect4 are given below (Romanian language only).
The Computer Programming Course
First as a TA, and later as the head of the TA team for the Computer Programming Course, I adopted and maintained an open source online judge platform for the course laboratories (which is live here) as one of my typical duties. The source code for the platform belongs to the popular Romanian competitive algorithm design online judge Infoarena. By December 2014, the judge had reached 638 registered students.
A short video tutorial instructing the students how to use the website is given above (Romanian language only).
Google AI Challenge, Fall 2011
In Fall 2011, I participated in the Google AI Challenge together with a team of 3 colleagues from my undergraduate course. We designed and implemented an AI bot that would control a swarm of ants as they forage for food and wage war against opponent swarms on the map. In the final championship, our bot ranked 63rd out of a total of 7,897 teams from all over the world. You can watch an online demo of our AI bot at work below.
|The source code for the project is hosted publicly on my BitBucket account.||(repository)|
In 2010, I led a team to design a Chess engine in C++, which at the end of the year won the tournament organized as part of my Algorithm Design class. We developed the engine to be compatible with the XBoard Chess platform from the GNU Project. The implementation was based on an Alpha-Beta pruned NegaMax algorithm, with added support for Quiescence Search, custom heuristics, and a database of openings from famous chess championships to buy us time in the early stages of the game. We implemented the algorithms over our custom bit-wise representation of the gameboard, hence naming the engine "BitBoard". You can watch an online demo of the engine below.
|The source code for the project is hosted publicly on my BitBucket account.||(repository)|
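For readers unfamiliar with the search technique named above, here is a minimal, generic sketch of NegaMax with Alpha-Beta pruning. It is written in Python for brevity and is not the engine's C++ code: it omits quiescence search, move ordering, the opening database, and the bit-wise board representation, and it is demonstrated on a toy subtraction game (an assumption chosen purely to keep the example short) rather than on chess.

```python
import math


def negamax(state, depth, alpha, beta, moves, evaluate):
    """Return the value of `state` from the point of view of the side to move.

    `moves(state)` yields successor states (with the other side to move) and
    `evaluate(state)` scores a position for the side to move. Branches that
    cannot change the result (alpha >= beta) are skipped.
    """
    children = list(moves(state))
    if depth == 0 or not children:
        return evaluate(state)
    best = -math.inf
    for child in children:
        # The opponent's best score is the negation of ours, and the search
        # window is negated and swapped accordingly.
        score = -negamax(child, depth - 1, -beta, -alpha, moves, evaluate)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff: the opponent will never allow this line
    return best


# Toy game: a pile of stones, each player removes 1 to 3 stones per turn,
# and whoever takes the last stone wins.
def nim_moves(stones):
    return [stones - k for k in (1, 2, 3) if k <= stones]


def nim_evaluate(stones):
    # No stones left means the previous player took the last one, so the
    # side to move has lost. Unfinished positions are scored as neutral.
    return -1 if stones == 0 else 0


if __name__ == "__main__":
    for pile in range(1, 9):
        value = negamax(pile, pile, -math.inf, math.inf, nim_moves, nim_evaluate)
        print(pile, "win" if value > 0 else "loss")
```

In a chess engine the same recursion runs over board positions, with a heuristic evaluation at the depth limit in place of the exact win/loss score used here.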
Crash Course in C++ for Java Programmers @ UPB
In 2011, upon becoming a Teaching Assistant for the Algorithm Design course at the Politehnica University of Bucharest, I promoted the use of C++ at the labs, and I wrote a Crash Course in C++ for Java Programmers handbook (click thumbnail to download PDF - Romanian language only) designed to integrate seamlessly with the OOP curriculum at the university. The handbook proved successful with later generations as well, and was referenced externally, on websites such as www.itassistant.org.
Following its success, I was later invited in 2012 to teach a Crash Course in C++ workshop at THALES Systems Romania.
Crash Course in C++ Templates and the STL @ ROSEdu CDL
In 2010, a year after having graduated from the first edition of the ROSEdu Community Development Lab (CDL) myself, I taught a crash course in C++ Templates and the Standard Template Library (click thumbnail to download PDF slides - Romanian language only). The course was a success, which prompted me to return and teach it the following year as well.
Published Research/Academic Papers
“Automatic Plagiarism Detection System for Specialized Corpora.”, F. Buruiană, A. Scoică, T. Rebedea, R. Rughiniș, in CSCS 19th International Conference on Control Systems and Computer Science, pp 77-82, IEEE, 2013. (PDF, BibTeX)
“The Impact of Competitiveness in Open Source on Education Quality: The Romanian Open Source Education Community”, A. Scoică, 2nd Workshop on Education by Research and Competition, 2012. (PDF)
ACM XRDS International Grad Student Magazine
I was on the editorial board of ACM XRDS, the Association for Computing Machinery’s international grad student magazine, between December 2012 and December 2017. My job as columnist of the Profile department was to select, interview, and write stories about world-class computer scientists and tech leaders whose work is related to each issue’s topic. I was Issue Editor for the Fall 2014 Issue on Natural Language Processing, and was subsequently promoted to Departments Chief, managing the magazine's team of in-house editors. Below is a list of my published articles:
In 2012, I worked with American author and economist Andrew Tobias to translate one of his books, entitled "The Best Little Boy in the World", into the Romanian language. The book was subsequently published online on Google Books, and is available to read for free at this link location.
In the summer of 2017, after having collaborated with UPES ACM Student Chapter in Dehradun, Uttarakhand, in the North of India, I was featured in the Summer 2017 Issue of "VOID" magazine, for which I gave a biographical interview.
The best and most reliable way to reach me is via e-mail. Please click on the small envelope icon to the bottom-left of the screen to get my address. I am usually quite responsive, and reply within 48 hours. If I take longer, it is probably because I am traveling and do not have access to the Internet.