|Topic||Publication||Summary / Abstract|
|Design and Implementation of a Research and Education Cybersecurity Operations Center||Cybersecurity and Secure Information Systems -- Advanced Sciences and Technologies for Security Applications. Springer 2019||The growing number and severity of cybersecurity threats, combined with a shortage of skilled security analysts, has led to an increased focus on cybersecurity research and education. In this article, we describe the design and implementation of an education and research Security Operations Center (SOC) to address these issues. The design of a SOC to meet educational goals as well as perform cloud security research is presented, including a discussion of SOC components created by our lab, such as honeypots, visualization tools, and a lightweight cloud security dashboard with autonomic orchestration. Experimental results of the honeypot project are provided, including analysis of SSH brute force attacks (aggregate data over time, attack duration, and identification of well-known botnets), geolocation and attack pattern visualization, and autonomic frameworks based on the observe, orient, decide, act methodology. Directions for future work are also discussed.|
|Dynamic Data Quality for Static Blockchains||The First International Workshop on Blockchain and Data Management (BlockDM 2019) at 35th IEEE International Conference on Data Engineering (ICDE 2019)||Blockchain’s popularity has changed the way people think about data access, storage, and retrieval. Because of this, many classic data management challenges are imbued with renewed significance. One such challenge is the issue of Dynamic Data Quality. As time passes, data changes in content and structure and thus becomes dynamic. Data quality, therefore, also becomes dynamic because it is an aggregate characteristic of the changing content and changing structure of data itself. But blockchain is a static structure. The friction between static blockchains and Dynamic Data Quality gives rise to new research opportunities, which the authors address in this paper.|
|Demystifying Blockchain by Teaching It in Computer Science||Consortium for Computing Sciences in Colleges, Northeastern Region (CCSCNE) 2019||This paper demystifies the advanced computer science topic of blockchain by placing it in the context of course and content development. In presenting suggestions for using blockchain as a tool to teach core computer science concepts, the authors reflect on student-centered, research-based projects spent understanding blockchain and developing an elementary implementation. Their experiences led to several teachable moments applicable to many topics across CS curricula including software design, algorithms and data structures, and distributed computing. The authors discuss many definitions of blockchain filtered through the philosophical lens of essence and accidents, give a precise definition of “essential” blockchain, and provide insight to understanding blockchain by presenting several of its structures and their implementation in the context of those curricular topics.|
|A HoneyNet Environment for Analyzing Malicious Actors||2018 IEEE MIT URTC||A honeypot is a web application or other resource that is deceptively constructed to log actions of its users, most (but not all) of whom can be assumed to be malicious actors. A honeynet is a network of honeypots. Thanks to their interconnectedness, honeynets allow for vast amounts of data to be collected for analysis. In this paper we discuss how we came to build a honeynet, its design and implementation, and a few insights gained by analyzing attack data gathered from it.|
|An API Honeypot for DDoS and XSS Analysis||2017 IEEE MIT URTC||Honeypots are servers or systems built to mimic critical parts of a network, distracting attackers while logging their information to develop attack profiles. This paper discusses the design and implementation of a honeypot disguised as a REpresentational State Transfer (REST) Application Programming Interface (API). We discuss the motivation for this work, design features of the honeypot, and experimental performance results under various traffic conditions. We also present analyses of both a distributed denial of service (DDoS) attack and a cross-site scripting (XSS) malware insertion attempt against this honeypot.|
|Text Stream Processing||Encyclopedia of Database Systems. Liu L., Özsu M. (eds) Springer, New York, NY||A text stream is a continuously generated series of comments or small text documents. Each comment or text document may be associated with a time stamp indicating when it was produced or received by a certain device or system. Text stream processing refers to real-time extraction of desired information from text streams (through categorizing and clustering documents in text streams, detecting and tracking topics, matching patterns, and discovering events).|
|Non-relational Streams||Encyclopedia of Database Systems. Liu L., Özsu M. (eds) Springer, New York, NY||A non-relational stream is a continuously generated, ordered collection of data items that are not relational tuples and therefore not readily processed by relational algebraic operators such as selection, projection, join, and aggregation.|
|An Introduction to Dynamic Data Quality Challenges||ACM Journal of Data and Information Quality (JDIQ) - 2017 January, Volume 8 Number 2, Pages 6:1-6:3, ACM Press (DOI: 10.1145/2998575)||We live in an evolving world. As time passes, data changes in content and structure, and thus becomes dynamic. Data quality, therefore, also becomes dynamic because it is an aggregate characteristic of data itself. Thus, our evolving world and the Internet of Things (IoT) present renewed challenges in data quality. IoT data is teeming with multivendor and multiprovider applications, devices, microservices, and automated processes built on social media, public and private datasets, digitized records, sensor logs, web logs, and much more. From intelligent traffic systems to smart healthcare devices, modern enterprises are inundated with a daily deluge of dynamic big data.|
|G* Studio: An Adventure in Graph Databases, Distributed Systems, and Software Development||ACM Inroads - 2016 June, Volume 7 Number 2, Pages 58 - 66, ACM Press (DOI: 10.1145/2896823)||The e-mail from the department chair was urgent. There were several graduate students with no classes to take. “Would somebody please run an independent study?” she asked. The semester was already a few days old. Alan had to strike fast. “I’m in,” he wrote, “I’ll put them to work on my graph database research.” With that, Alan and his new team, which would become known as the G-stars, began a two-semester adventure in graph databases, distributed systems, and software development that resulted in more than 8,000 lines of code over 520 Git commits. This is their story.|
|The G* Graph Database: Efficiently Managing Large Distributed Dynamic Graphs||DAPD - The Springer Journal of Distributed and Parallel Databases - Volume 33, Issue 4, pp 479-514||From sensor networks to transportation infrastructure to social networks, we are awash in data. Many of these real-world networks tend to be large ("big data") and dynamic, evolving over time. Their evolution can be modeled as a series of graphs. Traditional systems that store and analyze one graph at a time cannot effectively handle the complexity and subtlety inherent in dynamic graphs. Modern analytics require systems capable of storing and processing series of graphs. We present such a system. G* compresses dynamic graph data based on commonalities among the graphs in the series for deduplicated storage on multiple servers. In addition to the obvious space-saving advantage, large-scale graph processing tends to be I/O bound, so faster reads from and writes to stable storage enable faster results. Unlike traditional database and graph processing systems, G* executes complex queries on large graphs using distributed operators to process graph data in parallel. It speeds up queries on multiple graphs by processing graph commonalities only once and sharing the results across relevant graphs. This architecture not only provides scalability, but since G* is not limited to processing only what is available in RAM, its analysis capabilities are far greater than other systems which are limited to what they can hold in memory. This paper presents G*'s design and implementation principles along with evaluation results that document its unique benefits over traditional graph processing systems.|
|A Demonstration of Query-Oriented Distribution and Replication Techniques for Dynamic Graph Data||23rd International World Wide Web Conference (WWW 2014)||Evolving networks can be modeled as series of graphs that represent those networks at different points in time. Our G* system enables efficient storage and querying of these graph snapshots by taking advantage of their commonalities. In extending G* for scalable and robust operation, we found the classic challenges of data distribution and replication to be imbued with renewed significance. If multiple graph snapshots are commonly queried together, traditional techniques that distribute data over all servers or create identical data replicas result in inefficient query execution.|
|Efficient Top-K Closeness Centrality Search||30th IEEE International Conference on Data Engineering (ICDE 2014)||Many of today's applications can benefit from the discovery of the most central entities in real-world networks. This paper presents a new technique that efficiently finds the K most central entities in terms of closeness centrality. Instead of computing the centrality of each entity independently, our technique shares intermediate results between centrality computations. Since the cost of each centrality computation may vary substantially depending on the choice of the previous computation, our technique schedules centrality computations in a manner that minimizes the estimated completion time. This technique also updates, with negligible overhead, an upper bound on the centrality of every entity. Using this information, our technique proactively skips entities that cannot belong to the final result. This paper presents evaluation results for actual networks to demonstrate the benefits of our technique.|
|Scalable and Robust Management of Dynamic Graph Data||First International Workshop on Big Dynamic Distributed Data (BD3) at the 39th International Conference on Very Large Data Bases (BD3@VLDB 2013)||Most real-world networks evolve over time. This evolution can be modeled as a series of graphs that represent a network at different points in time. Our G* system enables efficient storage and querying of these graph snapshots by taking advantage of the commonalities among them. We are extending G* for highly scalable and robust operation. This paper shows that the classic challenges of data distribution and replication are imbued with renewed significance given continuously generated graph snapshots. Our data distribution technique adjusts the set of worker servers for storing each graph snapshot in a manner optimized for popular queries. Our data replication approach maintains each snapshot replica on a different number of workers, making available the most efficient replica configurations for different types of queries.|
|Quickly Finding the k Most Central Entities in Large Networks||New England Database Summit 2013||Many of today's applications can benefit from the discovery of the most central entities in real-world networks. Researchers have been developing techniques for finding the k most central entities in a network where the centrality of an entity is defined as the inverse of the average shortest path length from that entity to other entities. These previous techniques compute the centrality of each entity using a traditional single-source shortest path algorithm and then select k entities with the highest centrality values. Given a large network, however, these techniques incur high computational overhead. Our technique overcomes the above limitation. A key principle of our technique is to materialize intermediate results while a vertex's centrality is computed, and then reuse those results to speed up the computation of another vertex's centrality.|
|A Demonstration of the G* Graph Database System||The 29th International Conference on Data Engineering (ICDE 2013)||G* meets new challenges in managing multiple graphs while supporting fundamental graph querying capabilities by storing graphs on a large number of servers while compressing them based on their commonalities. It also allows users to easily express queries on graphs and efficiently execute those queries by sharing computations across graphs.|
|Computational Finance with Map-Reduce in Scala||The 18th International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2012)||This paper presents results of computational finance experiments using actor-based map-reduce in Scala. In general we observe superlinear speedup, super-efficiency, and evidence for a high degree of compute and I/O overlap end-to-end for different hardware platforms. These results should be of interest to academic researchers as well as industry practitioners.|
|A Game Design & Programming Concentration within the Computer Science Curriculum||Proceedings of the 36th SIGCSE Technical Symposium on Computer Science Education ACM Press, Pages 545-550, 2005.||This paper describes initiatives to develop a Game Concentration in the undergraduate Computer Science curriculum. These initiatives contemplate recommendations for existing courses as well as adoption of new courses.|
|Case Study: Oracle database development for the New York State Office of Mental Health||Proceedings of the Ninth Annual Institute on Mental Health Management Information||Summarizes the approach and development methods used to build an information system that reduced costs in time and money in its first year.|
|Core Concepts in Delphi||Series of articles for the Unofficial Newsletter of Delphi Users||Covers fundamental computer science and programming concepts and illustrates them in the object-oriented programming language Delphi.|
|Associate Professor||Compilers, Operating Systems, Graph and Relational Database Systems, Software Development||Marist College|
|Assistant Professor||Database Systems, Compilers, Operating Systems, Technology Entrepreneurship, Software Development||Marist College|
|Adjunct Professor||Compiler Design||Vassar College|
|Award Winner||2009 IBM Faculty Scholarship||IBM Scholars Program|
|Sr. Professional Lecturer||Compilers, Functional Programming in Erlang and Scala, Operating Systems, Software Development Best Practices||Marist College|
|Invited Speaker||American Culture in an IT-Driven Society||Beijing University of Science and Technology|
|Award Winner||2005 IBM Eclipse Innovation Grant||IBM Scholars Program|
|Invited Speaker||E-commerce Software Architecture and Implementation||College for Software Engineering, Graduate School of the Chinese Academy of Sciences|
|Professional Lecturer||E-commerce, Databases, Software Development, Compilers, Networking||Marist College|
|Adjunct Professor||Operating Systems||State University of New York at Westchester|
|Member||Curriculum Advisory Committee||State University of New York at Westchester|
|Guest Lecturer||Advanced Java Programming||Pace University|
|Adjunct Professor||Information and Data Management (Graduate)||Marist College|
|Adjunct Professor||Database Systems||State University of New York at Purchase|
|Adjunct Professor||Object-Oriented Programming in Java, Database Systems||Mount Saint Mary College|
|Language Study: Erlang||Undergraduate||Marist College|
|Introductory Programming with Games||Undergraduate||Marist College|
|Theory of Programming Languages||Undergraduate||Marist College|
|Operating Systems||Undergraduate/Graduate||Marist College|
|E-Commerce Development||Undergraduate||Marist College|
|Advanced Application Development||Undergraduate||Marist College|
|Compiler Design and Implementation||Undergraduate/Graduate||Marist College, Vassar College|
|Data Communications and Networks||Undergraduate/Graduate||Marist College|
|Operating Systems||Undergraduate||SUNY Westchester|
|Fundamentals of Database Systems||Undergraduate/Graduate||Mount Saint Mary College, SUNY Purchase, Marist College|
|Introduction to OOP in .Net||Undergraduate||Marist College|
|Language Study: ML||Undergraduate||Marist College|