Research

(With a little effort this might work itself into reasonable shape. Due to perpetual lack of time, the information below should suffice to whet your appetite for now. If you want more information on any of these projects, please drop me a note.)

The overarching research umbrella at the Parallel Architecture Group at Northwestern (PARAG@N) is energy-efficient computing. At the macro scale, computers consume inordinate amounts of energy, negatively impacting the economics and environmental footprint of computing. At the micro scale, power constraints prevent us from riding Moore's Law. We attack both problems by identifying sources of energy inefficiency and working on hardware/software techniques for cross-stack energy optimization. Thus, our work extends from circuit and hardware design, through programming languages and OS optimizations, all the way to application software. In a nutshell, our work aims to minimize the overheads associated with data storage and data transfers (e.g., through adaptive memory hierarchy designs, memory technologies, and silicon photonics), computational overheads (e.g., through specialized computing on dark silicon, approximate computing), and circuits (e.g., through speculative arithmetic units, fused accelerators), and in the long term aims to push back the bandwidth and power walls by designing 1000+-core virtual macro-chips with nanophotonic interconnects and optical memories. An overview of our research at PARAG@N was presented in an invited talk at IBM T.J. Watson Research Center and Google Chicago in March 2012. That talk is a little old and many things have happened since then, but it is a good starting point.

More specifically, we work on:

Galaxy: Computer Architecture Meets Silicon Photonics

This project combines advances in parallel computer architecture and silicon photonics to develop architectures that break past the power, bandwidth and utilization walls (dark silicon) that plague modern processors. The Galaxy architecture of optically-connected disintegrated processors argues that instead of building monolithic chips, we should split them into several smaller chiplets and form a "virtual macro-chip" by connecting them with optical links. The optics provide such high-bandwidth communication that they break the bandwidth wall entirely, and such low latency that the virtual macro-chip behaves as a single tightly-coupled chip. As each chiplet has its own power budget and the optical links eliminate the traditional chip-to-chip communication overheads, the macro-chip behaves as an oversized multicore that scales beyond single-chip area limits, while maintaining high yield and reasonable cost (only faulty chiplets need replacement). Our preliminary results indicate that Galaxy scales seamlessly to 4000 cores, making it possible to shrink an entire rack's worth of computational power onto a single wafer. The full design was presented at an EPFL talk in 2014 and published at ICS-2014. This project has advanced the state of the art in silicon photonic interconnects by designing laser power-gating NoCs, developing the concept further through co-designing the on-chip NoC with the architecture, extending laser power-gating to datacenter optical networks, and overcoming the thermal transfer problems of 3D-stacked electro-optical processor/photonics chips. A full list of publications appears on the project web page of NSF CCF-1453853, the award on energy-efficient and energy-proportional silicon photonic manycore architectures that funded this work.

Elastic Fidelity: Disciplined Approximate Computing

At the circuit level, shrinking transistor geometries and the race for energy-efficient computing result in significant error rates at smaller technology nodes due to process variation and low voltages (especially with near-threshold computing). Traditionally, these errors are handled at the circuit and architectural layers, as computations expect 100% reliability. Elastic Fidelity computing is based on the observation that not all computations and data require 100% fidelity; we can judiciously let errors manifest in the error-resilient data, and handle them higher in the stack. We develop programming language extensions that allow data objects to be instantiated with certain accuracy guarantees. The compiler records these guarantees and communicates them to the hardware, which then steers computations and data to separate ALU/FPU blocks and cache/memory regions that relax the guardbands and run at lower voltage to conserve energy. This work was funded by NSF CCF-1218768 and NSF CCF-1217353.
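
As a rough illustration of the steering idea described above, the following minimal Python sketch tags data objects with a required fidelity and maps each one to the lowest-voltage domain that still satisfies it. This is not the project's actual language extension or hardware interface: the names Domain, Annotated, and steer, and all voltage and fidelity numbers, are hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """A hypothetical hardware region (ALU/FPU block or memory region)."""
    name: str
    vdd: float       # supply voltage in volts (illustrative value)
    fidelity: float  # fraction of results the region is assumed to get right


# Illustrative domains only; real guardbands would come from characterization.
DOMAINS = [
    Domain("near_threshold", 0.5, 0.90),
    Domain("relaxed", 0.7, 0.99),
    Domain("nominal", 1.0, 1.00),
]


@dataclass
class Annotated:
    """A data object tagged with the minimum fidelity the programmer requires."""
    value: object
    min_fidelity: float = 1.0  # default: fully reliable


def steer(obj: Annotated) -> Domain:
    """Pick the lowest-voltage domain that still meets the object's requirement."""
    eligible = [d for d in DOMAINS if d.fidelity >= obj.min_fidelity]
    return min(eligible, key=lambda d: d.vdd)


pixels = Annotated([0.1, 0.2], min_fidelity=0.95)  # error-resilient data
balance = Annotated(1000)                          # must stay exact

print(steer(pixels).name)   # relaxed
print(steer(balance).name)  # nominal
```

In this sketch the error-resilient pixel data lands in a lower-voltage region, while data with the default requirement stays in the fully guardbanded one.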

SeaFire: Design for Dark Silicon

While Elastic Fidelity and Elastic Caches cut back on energy consumption, they do not push the power wall far enough. To gain another order of magnitude, we must minimize the overheads of modern computing. The idea behind the SeaFire project is that instead of building conventional high-overhead multicores that we cannot power, we should repurpose the dark silicon for specialized energy-efficient cores. A running application will power up only the cores most closely matching its computational requirements, while the rest of the chip remains off to conserve energy. Preliminary results on SeaFire have appeared in a highly-cited IEEE Micro article in July 2011, an invited USENIX ;login: article in April 2012, the ACLD workshop in 2010, a keynote at ISPDC in 2010, an invited presentation at the NSF Workshop on Sustainable Energy-Efficient Data Management in 2011 (the abstract is here), and an invited presentation at HPTS in 2011. This work was funded by an ISEN Booster award and now continues as part of the Intel Parallel Computing Center at Northwestern (here is the Intel Press release) that I co-founded with faculty from the IEMS department.
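
The core-selection idea above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the SeaFire design: the core catalog CORES, the operation names, and the use of Jaccard similarity as the "closest match" metric are all hypothetical choices made for illustration.

```python
# Hypothetical catalog: each specialized core and the operation kinds it supports.
CORES = {
    "crypto": {"aes", "sha"},
    "dsp": {"fft", "fir"},
    "general": {"aes", "sha", "fft", "fir", "branch"},
}


def best_core(app_ops: set) -> str:
    """Return the core whose supported operations best match the application."""
    def jaccard(core_ops: set) -> float:
        # Overlap divided by union; 1.0 means a perfect match.
        return len(core_ops & app_ops) / len(core_ops | app_ops)
    return max(CORES, key=lambda name: jaccard(CORES[name]))


def power_states(app_ops: set) -> dict:
    """Power on only the best-matching core; every other core stays dark."""
    chosen = best_core(app_ops)
    return {name: name == chosen for name in CORES}


print(power_states({"aes", "sha"}))
# {'crypto': True, 'dsp': False, 'general': False}
```

A workload that mixes operation kinds (for example {"fft", "branch"}) falls back to the general core in this model, since no specialized core covers it well.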

Elastic Memory Hierarchies

In this project we develop adaptive cache designs and memory hierarchy sub-systems that minimize the overheads of storing, retrieving and communicating data to/from memories and other cores. An incarnation of Elastic Caches for near-optimal data placement was published at ISCA 2009 and won an IEEE Micro Top Picks award in 2010, while newer papers at DATE 2012 and the IEEE Computer Special Issue on Multicore Coherence in 2013 present an instance of Elastic Caches that minimizes interconnect power by collocating directory meta-data with sharer cores. You can also find an interview on Dynamic Directories conducted by Prof. Srini Devadas (MIT) here. Through this project we also investigated DRAM thermal management techniques, which have been largely overlooked by the community, even though more than a third of energy is consumed by memory, and thermal events play an important role in the overall DRAM power consumption and reliability. Together with fellow faculty Seda and Gokhan Memik, we recognized the importance of the problem, and devised techniques to shape the power and thermal profile of DRAMs using OS-level optimizations. We published some of our results on DRAM thermal management at HPCA 2011. This thrust currently focuses on revisiting memory hierarchy designs, optical memories, and new hardware-software co-designs for virtual-to-physical address mapping. This work is partially funded by NSF CCF-1218768 and CCF-1453853.

Publications

2021

ST2 GPU: An Energy-Efficient GPU Design with Spatio-Temporal Shared-Thread Speculative Adders. Vijay Kandiah, Ali Murat Gok, Georgios Tziantzioulis and Nikos Hardavellas. Design Automation Conference (DAC), San Francisco, CA, December 2021.

Pho$: A Case for Shared Optical Cache Hierarchies. Haiyang Han, Theoni Alexoudi, Chris Vagionas, Nikos Pleros and Nikos Hardavellas. ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED), July 2021.

Task Parallel Assembly Language for Uncompromising Parallelism. Mike Rainey, Ryan R. Newton, Kyle Hale, Nikos Hardavellas, Simone Campanoni, Peter Dinda and Umut A. Acar. 42nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2021.

2020

CARAT: A Case for Virtual Memory through Compiler- and Runtime-based Address Translation. Brian Suchy, Simone Campanoni, Nikos Hardavellas and Peter Dinda. 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), London, UK, June 2020.

2019

Prospects for Functional Address Translation. Conor Hetland, Georgios Tziantzioulis, Brian Suchy, Kyle Hale, Nikos Hardavellas and Peter Dinda. 27th IEEE International Symposium on the Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), Rennes, France, October 2019.

Breaking Down Barriers: Paths to Fast Thread Synchronization on the Node. Conor Hetland, Georgios Tziantzioulis, Brian Suchy, Mike Leonard, Jin Han, John Albers, Nikos Hardavellas and Peter Dinda. 28th International Symposium on High-Performance Parallel and Distributed Computing (HPDC), Phoenix, Arizona, June 2019.

2018

Temporal Approximate Function Memoization. Georgios Tziantzioulis, Nikos Hardavellas and Simone Campanoni. IEEE Micro, Special Issue on Approximate Computing, Vol. 38(4), pp. 60-70, July/August 2018.

Unconventional Parallelization of Nondeterministic Applications. Enrico A. Deiana, Vincent St-Amour, Peter Dinda, Nikos Hardavellas and Simone Campanoni. 23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Williamsburg, VA, March 2018.

Operator-Level Parallelism. Nikos Hardavellas and Ippokratis Pandis. Encyclopedia of Database Systems, 2nd edition, L. Liu and M. T. Ozsu (Eds.), ISBN 978-1-4899-7993-3, Springer, 2018.

Execution Skew. Nikos Hardavellas and Ippokratis Pandis. Encyclopedia of Database Systems, 2nd edition, L. Liu and M. T. Ozsu (Eds.), ISBN 978-1-4899-7993-3, Springer, 2018.

Inter-Query Parallelism. Nikos Hardavellas and Ippokratis Pandis. Encyclopedia of Database Systems, 2nd edition, L. Liu and M. T. Ozsu (Eds.), ISBN 978-1-4899-7993-3, Springer, 2018.

Intra-Query Parallelism. Nikos Hardavellas and Ippokratis Pandis. Encyclopedia of Database Systems, 2nd edition, L. Liu and M. T. Ozsu (Eds.), ISBN 978-1-4899-7993-3, Springer, 2018.

Stop-&-Go Operator. Nikos Hardavellas and Ippokratis Pandis. Encyclopedia of Database Systems, 2nd edition, L. Liu and M. T. Ozsu (Eds.), ISBN 978-1-4899-7993-3, Springer, 2018.

2017

POSTER: The Liberation Day of Nondeterministic Programs. Enrico A. Deiana, Vincent St-Amour, Peter Dinda, Nikos Hardavellas and Simone Campanoni. 26th International Conference on Parallel Architectures and Compilation Techniques (PACT), Portland, OR, September 2017.

VaLHALLA: Variable Latency History Aware Local-carry Lazy Adder. Ali Murat Gok and Nikos Hardavellas. 27th ACM Great Lakes Symposium on VLSI (GLSVLSI), Banff, Alberta, Canada, May 2017.

Harnessing Path Divergence for Laser Control in Data Center Networks. Yigit Demir, Nikos Terzenidis, Haiyang Han, Dimitris Syrivelis, George T. Kanellos, Nikos Hardavellas, Nikos Pleros, Srikanth Kandula and Fabian Bustamante. In Proceedings of the 2017 IEEE Photonics Society Summer Topical Meeting Series (IEEE SUM), Optical Switching Technologies for Datacom and Computercom Applications (OSDC), San Juan, Puerto Rico, July 2017.
Invited Paper.

Energy Proportional Photonic Interconnects. Yigit Demir and Nikos Hardavellas. In 12th International Conference on High Performance and Embedded Architectures and Compilers (HiPEAC), Stockholm, Sweden, January 2017.

Techniques for Energy Proportionality in Optical Interconnects. Yigit Demir and Nikos Hardavellas. Photonic Interconnects for Computing Systems, G. Nicolescu, S. Le Beux, M. Nikdast and J. Xu (Eds.), The River Publishers' Series in Optics and Photonics, River Publishers, 2017.

2016

Evaluation of K-Means Data Clustering Algorithm on Intel Xeon Phi. S. Lee, W.-k. Liao, A. Agrawal, N. Hardavellas and A. Choudhary. In Proceedings of the 3rd Workshop on Advances in Software and Hardware for Big Data to Knowledge Discovery (ASH), co-located with the IEEE Conference on Big Data (IEEE BigData), Washington, D.C., December 5-8, 2016.

Energy Proportional Photonic Interconnects. Y. Demir and N. Hardavellas. In ACM Transactions on Architecture and Code Optimization (ACM TACO), Vol. 13(5), December 2016.

SLaC: Stage Laser Control for a Flattened Butterfly Network. Y. Demir and N. Hardavellas. In Proceedings of the 22nd IEEE International Symposium on High Performance Computer Architecture (HPCA), Barcelona, Spain, March 2016.

Lazy Pipelines: Enhancing Quality in Approximate Computing. G. Tziantzioulis, A. M. Gok, S M Faisal, N. Hardavellas, S. Ogrenci-Memik and S. Parthasarathy. In Proceedings of the Design, Automation, and Test in Europe (DATE), Dresden, Germany, March 2016.

Towards Energy-Proportional Optical Interconnects. Y. Demir and N. Hardavellas. In Proceedings of the 2nd International Workshop on Optical/Photonic Interconnects for Computing Systems (OPTICS), Dresden, Germany, March 2016.
Invited Paper.

2015

Edge Importance Identification for Energy Efficient Graph Processing. S. M. Faisal, G. Tziantzioulis, A. M. Gok, S. Parthasarathy, N. Hardavellas and S. Ogrenci-Memik. In Proceedings of the 2015 IEEE International Conference on Big Data (IEEE BigData), Santa Clara, CA, October 2015.

SCP: Synergistic Cache Compression and Prefetching. B. Patel, G. Memik and N. Hardavellas. In Proceedings of the 33rd IEEE International Conference on Computer Design (ICCD), New York City, NY, October 2015.

Parka: Thermally Insulated Nanophotonic Interconnects. Y. Demir and N. Hardavellas. In Proceedings of the 9th International Symposium on Networks-on-Chip (NOCS), Vancouver, Canada, September 2015.

b-HiVE: A Bit-Level History-Based Error Model with Value Correlation for Voltage-Scaled Integer and Floating Point Units. G. Tziantzioulis, A. M. Gok, S. M. Faisal, N. Hardavellas, S. Memik and S. Parthasarathy. In Proceedings of the Design Automation Conference (DAC), San Francisco, CA, June 2015.

Software: SoftInj, a software fault injection library that implements the b-HiVE error models.
Data Set: b-HiVE Hardware Characterization Dataset, a raw dataset of full-analog HSIM and SPICE simulations of industrial-strength 64-bit integer ALUs, integer multipliers, bitwise logic operations, FP adders, FP multipliers, and FP dividers from OpenSparc T1 across voltage domains, along with controlled value correlation experiments (2015).
The project website with links to released software and datasets is here.

Towards Energy-Efficient Photonic Interconnects. Y. Demir and N. Hardavellas. In Proceedings of SPIE, Optical Interconnects XV, San Francisco, CA, February 2015. Also selected to appear in SPIE Green Photonics.

2014

LaC: Integrating Laser Control in a Photonic Interconnect. Y. Demir and N. Hardavellas. In Proceedings of the IEEE Photonics Conference (IPC), pp. 28-29, La Jolla, CA, October 2014.

EcoLaser: An Adaptive Laser Control for Energy-Efficient On-Chip Photonic Interconnects. Y. Demir and N. Hardavellas. In Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), pp. 3-8, La Jolla, CA, August 2014.

Galaxy: A High-Performance Energy-Efficient Multi-Chip Architecture Using Photonic Interconnects. Y. Demir, Y. Pan, S. Song, N. Hardavellas, G. Memik and J. Kim. In Proceedings of the ACM International Conference on Supercomputing (ICS), pp. 303-312, Munich, Germany, June 2014.

LaC: Integrating Laser Control in a Photonic Interconnect. Y. Demir and N. Hardavellas. Technical Report NU-EECS-14-03, Northwestern University, Evanston, IL, April 2014.

EcoLaser: An Adaptive Laser Control for Energy Efficient On-Chip Photonic Interconnects. Y. Demir and N. Hardavellas. Technical Report NU-EECS-14-02, Northwestern University, Evanston, IL, April 2014.

2013

The Impact of Dynamic Directories on Multicore Interconnects. M. Schuchhardt, A. Das, N. Hardavellas, G. Memik and A. Choudhary. IEEE Computer, Special Issue on Multicore Memory Coherence, Vol. 46(10), pp. 32-39, October 2013.

Galaxy: A High-Performance Energy-Efficient Multi-Chip Architecture Using Photonic Interconnects. Y. Demir, Y. Pan, S. Song, N. Hardavellas, J. Kim and G. Memik. Technical Report NU-EECS-13-08, Northwestern University, Evanston, IL, July 2013.

2012

Towards a Schlieren Camera. B. Pattabiraman, R. Morton, A. Grabenhofer, N. Hardavellas, J. Tumblin and V. Gopal. In 8th Annual Mid-West Graphics Workshop (MIDGRAPH), Chicago, IL, December 2012.

Load Balancing for Processing Spatio-Temporal Queries in Multi-Core Settings. A. Yaagoub, G. Trajcevski, P. Scheuermann and N. Hardavellas. In 11th International ACM Workshop on Data Engineering for Wireless and Mobile Access (MobiDE), co-located with ACM SIGMOD International Conference on Management of Data and ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (ACM SIGMOD/PODS), Scottsdale, AZ, May 2012.

The Rise and Fall of Dark Silicon. N. Hardavellas. USENIX ;login:, Vol. 37, No. 2, pp. 7-17, April 2012.
Invited Paper.

Dynamic Directories: Reducing On-Chip Interconnect Power in Multicores. A. Das, M. Schuchhardt, N. Hardavellas, G. Memik and A. Choudhary. In Proceedings of Design, Automation, and Test in Europe (DATE), pp. 479-484, Dresden, Germany, March 2012.

2011

Elastic Fidelity: Trading-off Computational Accuracy for Energy Reduction. S. Roy, T. Clemons, S. M. Faisal, K. Liu, N. Hardavellas and S. Parthasarathy. Technical Report NWU-EECS-11-02, Northwestern University, Evanston, IL, February 2011. Indexed at arXiv:1111.4279 [cs.AR], November 2011.

Toward Dark Silicon in Servers. N. Hardavellas, M. Ferdman, B. Falsafi and A. Ailamaki. IEEE Micro, Special Issue on Big Chips, Vol. 31(4), pp. 6-15, July/August 2011. Also, IEEE Micro Spotlight Paper at Computing Now, February 2012.

Exploiting Dark Silicon for Energy Efficiency. N. Hardavellas. NSF Workshop on Sustainable Energy-Efficient Data Management (SEEDM), National Science Foundation, Arlington, VA, USA, May 2011.

Elastic Fidelity: Trading-off Computational Accuracy for Energy Reduction. S. Roy, T. Clemons, S. M. Faisal, K. Liu, N. Hardavellas and S. Parthasarathy. In 16th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Newport Beach, California, March 2011 (poster).

Hardware/Software Techniques for DRAM Thermal Management. S. Liu, B. Leung, A. Neckar, S. Ogrenci-Memik, G. Memik and N. Hardavellas. In Proceedings of the 17th IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 479-484, San Antonio, Texas, February 2011.

2010

PAD: Power-Aware Directory Placement in Distributed Caches. A. Das, M. Schuchhardt, N. Hardavellas, G. Memik and A. Choudhary. Technical Report NWU-EECS-10-11, Northwestern University, Evanston, IL, December 2010.

Exploring Benefits and Designs of Optically-Connected Disintegrated Processor Architecture. Y. Pan, Y. Demir, N. Hardavellas, J. Kim and G. Memik. In Workshop on the Interaction between Nanophotonic Devices and Systems (WINDS), co-located with the 43rd International Symposium on Microarchitecture (MICRO), Atlanta, GA, December 2010.

Data-Oriented Transaction Execution. I. Pandis, R. Johnson, N. Hardavellas and A. Ailamaki. Proceedings of the VLDB Endowment (PVLDB), Vol. 3(1), pp. 928-939, August 2010.

Data-Oriented Transaction Execution. I. Pandis, R. Johnson, N. Hardavellas and A. Ailamaki. 9th Hellenic Data Management Symposium (HDMS), Ayia Napa, Cyprus, July 2010.

The Path Forward: Specialized Computing in the Datacenter. N. Hardavellas, M. Ferdman, A. Ailamaki and B. Falsafi. In 2nd Workshop on Architectural Considerations for Large Datacenters (ACLD), co-located with the 37th ACM/IEEE Annual International Symposium on Computer Architecture (ISCA), Saint-Malo, France, June 2010.

Power Scaling: the Ultimate Obstacle to 1K-Core Chips. N. Hardavellas, M. Ferdman, A. Ailamaki and B. Falsafi. Technical Report NWU-EECS-10-05, Northwestern University, Evanston, IL, March 2010.

Near-Optimal Cache Block Placement with Reactive Nonuniform Cache Architectures. N. Hardavellas, M. Ferdman, B. Falsafi and A. Ailamaki. IEEE Micro, Vol. 30(1), pp. 20-28, January/February 2010.
IEEE Micro Top Picks from Computer Architecture Conferences.

Data-Oriented Transaction Execution. I. Pandis, R. Johnson, N. Hardavellas and A. Ailamaki. Technical Report CMU-CS-10-101, Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, January 2010.

2009

Reactive NUCA: Near-Optimal Block Placement and Replication in Distributed Caches. N. Hardavellas, M. Ferdman, B. Falsafi and A. Ailamaki. In Proceedings of the 36th ACM/IEEE Annual International Symposium on Computer Architecture (ISCA), pp. 184-195, Austin, TX, June 2009.
IEEE Micro Top Picks from Computer Architecture Conferences.

Shore-MT: A Scalable Storage Manager for the Multicore Era. R. Johnson, I. Pandis, N. Hardavellas, A. Ailamaki and B. Falsafi. In Proceedings of the 12th International Conference on Extending Database Technology (EDBT), pp. 24-35, Saint-Petersburg, Russia, March 2009.
Test-of-Time Award, 2019.
Software: Shore-MT, a scalable storage manager for the multicore era.

Operator-Level Parallelism. N. Hardavellas and I. Pandis. Encyclopedia of Database Systems, pp. 1981-1985, L. Liu and M. T. Ozsu (Eds.), ISBN 978-0-387-35544-3, Springer, 2009.

Execution Skew. N. Hardavellas and I. Pandis. Encyclopedia of Database Systems, pp. 1079, L. Liu and M. T. Ozsu (Eds.), ISBN 978-0-387-35544-3, Springer, 2009.

Inter-Query Parallelism. N. Hardavellas and I. Pandis. Encyclopedia of Database Systems, pp. 1566-1567, L. Liu and M. T. Ozsu (Eds.), ISBN 978-0-387-35544-3, Springer, 2009.

Intra-Query Parallelism. N. Hardavellas and I. Pandis. Encyclopedia of Database Systems, pp. 1567-1568, L. Liu and M. T. Ozsu (Eds.), ISBN 978-0-387-35544-3, Springer, 2009.

Stop-and-Go Operator. N. Hardavellas and I. Pandis. Encyclopedia of Database Systems, pp. 2794, L. Liu and M. T. Ozsu (Eds.), ISBN 978-0-387-35544-3, Springer, 2009.

2008

R-NUCA: Data Placement in Distributed Shared Caches. N. Hardavellas, M. Ferdman, B. Falsafi and A. Ailamaki. Technical Report CALCM-TR-2008-001, Computer Architecture Lab, Carnegie Mellon University, Pittsburgh, PA, December 2008.

Shore-MT: A Quest for Scalability in the Many-Core Era. R. Johnson, I. Pandis, N. Hardavellas and A. Ailamaki. Technical Report CMU-CS-08-114, Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, 2008.

To Share Or Not To Share? R. Johnson, N. Hardavellas, I. Pandis, N. Mancheril, S. Harizopoulos, K. Sabirli, A. Ailamaki and B. Falsafi. 7th Hellenic Data Management Symposium (HDMS), Heraklion, Crete, Greece, July 2008.

2007

Multi-bit Error Tolerant Caches Using Two-Dimensional Error Coding. J. Kim, N. Hardavellas, K. Mai, B. Falsafi and J. C. Hoe. In Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 197-209, Chicago, IL, December 2007.

To Share Or Not To Share? R. Johnson, N. Hardavellas, I. Pandis, N. Mancheril, S. Harizopoulos, K. Sabirli, A. Ailamaki and B. Falsafi. In Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB), pp. 351-362, Vienna, Austria, September 2007.

An Analysis of Database System Performance on Chip Multiprocessors. N. Hardavellas, I. Pandis, R. Johnson, N. Mancheril, S. Harizopoulos, A. Ailamaki and B. Falsafi. 6th Hellenic Data Management Symposium (HDMS), Athens, Greece, July 2007.

Scheduling Threads for Constructive Cache Sharing on CMPs. S. Chen, P. B. Gibbons, M. Kozuch, V. Liaskovitis, A. Ailamaki, G. E. Blelloch, B. Falsafi, L. Fix, N. Hardavellas, T. C. Mowry and C. Wilkerson. In Proceedings of the 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pp. 105-115, San Diego, CA, June 2007.

Database Servers on Chip Multiprocessors: Limitations and Opportunities. N. Hardavellas, I. Pandis, R. Johnson, N. Mancheril, A. Ailamaki and B. Falsafi. In Proceedings of the 3rd Biennial Conference on Innovative Data Systems Research (CIDR), pp. 79-87, Asilomar, CA, January 2007.

2006

An Analysis of Database System Performance on Chip Multiprocessors. N. Hardavellas, I. Pandis, R. Johnson, N. Mancheril, S. Harizopoulos, A. Ailamaki and B. Falsafi. Technical Report CMU-CS-06-153, Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, 2006.

Parallel Depth First vs. Work Stealing Schedulers on CMP Architectures. V. Liaskovitis, S. Chen, P. B. Gibbons, A. Ailamaki, G. E. Blelloch, B. Falsafi, L. Fix, N. Hardavellas, M. Kozuch, T. C. Mowry and C. Wilkerson. In Proceedings of the 18th Annual ACM International Symposium on Parallelism in Algorithms and Architectures (SPAA), pp. 330, Cambridge, MA, August 2006.

Simultaneous Pipelining in QPipe: Exploiting Work Sharing Opportunities Across Queries. D. Dash, K. Gao, N. Hardavellas, S. Harizopoulos, R. Johnson, N. Mancheril, I. Pandis, V. Shkapenyuk and A. Ailamaki. Demonstration, In Proceedings of the 22nd International Conference on Data Engineering (ICDE), Atlanta, GA, April 2006.
Best Demonstration Award.

2005

Store-Ordered Streaming of Shared Memory. T. F. Wenisch, S. Somogyi, N. Hardavellas, J. Kim, C. Gniady, A. Ailamaki and B. Falsafi. In Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 75-86, Saint Louis, MO, September 2005.

Temporal Streaming of Shared Memory. T. F. Wenisch, S. Somogyi, N. Hardavellas, J. Kim, A. Ailamaki and B. Falsafi. In Proceedings of the 32nd ACM/IEEE Annual International Symposium on Computer Architecture (ISCA), pp. 222-233, Madison, WI, June 2005.

2004

SORDS: Just-In-Time Streaming of Temporally-Correlated Shared Data. T. Wenisch, S. Somogyi, N. Hardavellas, J. Kim, C. Gniady, A. Ailamaki and B. Falsafi. Technical Report CALCM-TR-2004-002, Computer Architecture Lab, Carnegie Mellon University, Pittsburgh, PA, November 2004.

Memory Coherence Activity Prediction in Commercial Workloads. S. Somogyi, T. F. Wenisch, N. Hardavellas, J. Kim, A. Ailamaki and B. Falsafi. 3rd Workshop on Memory Performance Issues (WMPI), pp. 37-45, Munich, Germany, June 2004.

SimFlex: a Fast, Accurate, Flexible Full-System Simulation Framework for Performance Evaluation of Server Architecture. N. Hardavellas, S. Somogyi, T. F. Wenisch, R. E. Wunderlich, S. Chen, J. Kim, B. Falsafi, J. C. Hoe and A. Nowatzyk. ACM SIGMETRICS Performance Evaluation Review (PER) Special Issue on Tools for Computer Architecture Research, Vol. 31(4), pp. 31-35, March 2004.
Software: Flexus, a scalable, full-system, cycle-accurate simulation framework of multicore and multiprocessor systems.

2003 and prior


Adaptive Dirty-Block Purging. S. C. Steely Jr. and N. Hardavellas. U.S. patent 6,493,801, December 2002.

Apparatus and Method for Maintaining Data Coherence Within a Cluster of Symmetric Multiprocessors. L. I. Kontothanassis, M. L. Scott, N. Hardavellas, G. C. Hunt, R. J. Stets and S. Dwarkadas. U.S. patent 6,341,339, January 2002.

The Implementation of Cashmere. R. J. Stets, D. Chen, S. Dwarkadas, N. Hardavellas, G. C. Hunt, L. Kontothanassis, G. Magklis, S. Parthasarathy, U. Rencuzogullari and M. L. Scott. Technical Report TR 723, Computer Science Department, University of Rochester, Rochester, NY, December 1999.

Cashmere-VLM: Remote Memory Paging for Software Distributed Shared Memory. S. Dwarkadas, N. Hardavellas, L. Kontothanassis, R. Nikhil and R. Stets. In Proceedings of the 13th IEEE/ACM International Parallel Processing Symposium (IPPS), pp. 153-159, San Juan, Puerto Rico, April 1999.

Software Cache Coherence with Memory Scaling. N. Hardavellas, L. Kontothanassis, R. Nikhil and R. J. Stets. 7th Workshop on Scalable Shared Memory Multiprocessors (SSMM), Barcelona, Spain, June 1998.

Understanding the Performance of DSM Applications. W. Meira Jr., T. J. LeBlanc, N. Hardavellas and C. Amorim. Communication and Architectural Support for Network-Based Parallel Computing (CANPC), D. Panda and C. Stunkel Eds., Lecture Notes in Computer Science, Vol. 1199/1997, pp. 198-211, Springer Berlin/Heidelberg, February 1997, DOI: 10.1007/3-540-62573-9_15.

Cashmere-2L: Software Coherent Shared Memory on a Clustered Remote-Write Network. R. J. Stets, S. Dwarkadas, N. Hardavellas, G. C. Hunt, L. Kontothanassis, S. Parthasarathy and M. L. Scott. In Proceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP), pp. 170-183, Saint Malo, France, October 1997.

VM-Based Shared Memory on Low-Latency, Remote-Memory-Access Networks. L. Kontothanassis, G. C. Hunt, R. J. Stets, N. Hardavellas, M. Cierniak, S. Parthasarathy, W. Meira Jr., S. Dwarkadas and M. L. Scott. In Proceedings of the 24th ACM/IEEE Annual International Symposium on Computer Architecture (ISCA), pp. 157-169, Denver, CO, June 1997.

Efficient Use of Memory Mapped Interfaces for Shared Memory Computing. N. Hardavellas, G. C. Hunt, S. Ioannidis, R. J. Stets, S. Dwarkadas, L. Kontothanassis and M. L. Scott. In IEEE CS Technical Committee on Computer Architecture (TCCA) Special Issue on Distributed Shared Memory, pp. 28-33, March 1997.

VM-Based Shared Memory on Low-Latency, Remote-Memory-Access Networks. L. Kontothanassis, G. C. Hunt, R. J. Stets, N. Hardavellas, M. Cierniak, S. Parthasarathy, W. Meira Jr, S. Dwarkadas and M. L. Scott. Technical Report TR 643, Computer Science Department, University of Rochester, Rochester, NY, November 1996.

The Implementation of Cashmere. M. L. Scott, W. Li, L. Kontothanassis, G. C. Hunt, M. Michael, R. J. Stets, N. Hardavellas, W. Meira Jr., A. Poulos, M. Cierniak, S. Parthasarathy and M. Zaki. 6th Workshop on Scalable Shared Memory Multiprocessors (SSMM), Boston, MA, October 1996.

Contention in Counting Networks. C. Busch, N. Hardavellas and M. Mavronicolas. In Proceedings of the 13th ACM Annual Symposium on Principles of Distributed Computing (PODC), Los Angeles, CA, August 1994.

Notes on Sorting and Counting Networks. N. Hardavellas, D. Karakos and M. Mavronicolas. Distributed Algorithms (WDAG), A. Schiper Ed., Lecture Notes in Computer Science, Vol. 725/1993, pp. 234-248, Springer Berlin/Heidelberg, September 1993, DOI: 10.1007/3-540-57271-6_39.

Notes on Sorting and Counting Networks. N. Hardavellas, D. Karakos and M. Mavronicolas. Technical Report FORTH-ICS/TR-092, Institute of Computer Science, Foundation for Research and Technology - Hellas, Heraklion, Crete, Greece, July 1993.

Artifacts

Software

TPAL: a task-parallel assembly language for heartbeat scheduling that dramatically reduces the overheads of parallelism without compromising scalability.
Please cite as follows: Task Parallel Assembly Language for Uncompromising Parallelism. M. Rainey, P. Dinda, K. Hale, R. Newton, U. A. Acar, N. Hardavellas, S. Campanoni. 42nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2021.

SoftInj: a software fault injection library that implements the b-HiVE error models.
Please cite as follows: b-HiVE: A Bit-Level History-Based Error Model with Value Correlation for Voltage-Scaled Integer and Floating Point Units. G. Tziantzioulis, A. M. Gok, S. M. Faisal, N. Hardavellas, S. Memik and S. Parthasarathy. In Proceedings of the Design Automation Conference (DAC), San Francisco, CA, June 2015.

Shore-MT, a scalable storage manager for the multicore era.
Test-of-Time Award, EDBT 2019.
Please cite as follows: Shore-MT: A Scalable Storage Manager for the Multicore Era. R. Johnson, I. Pandis, N. Hardavellas, A. Ailamaki and B. Falsafi. In Proceedings of the 12th International Conference on Extending Database Technology (EDBT), pp. 24-35, Saint-Petersburg, Russia, March 2009.

Flexus, a scalable, full-system, cycle-accurate simulation framework of multicore and multiprocessor systems.
Please cite as follows: SimFlex: a Fast, Accurate, Flexible Full-System Simulation Framework for Performance Evaluation of Server Architecture. N. Hardavellas, S. Somogyi, T. F. Wenisch, R. E. Wunderlich, S. Chen, J. Kim, B. Falsafi, J. C. Hoe and A. Nowatzyk. ACM SIGMETRICS Performance Evaluation Review (PER) Special Issue on Tools for Computer Architecture Research, Vol. 31(4), pp. 31-35, March 2004.

Data Sets

b-HiVE Hardware Characterization Dataset: a raw dataset of full-analog HSIM and SPICE simulations of industrial-strength 64-bit integer ALUs, integer multipliers, bitwise logic operations, FP adders, FP multipliers, and FP dividers from OpenSparc T1 across voltage domains, along with controlled value correlation experiments.
Please cite as follows: b-HiVE: A Bit-Level History-Based Error Model with Value Correlation for Voltage-Scaled Integer and Floating Point Units. G. Tziantzioulis, A. M. Gok, S. M. Faisal, N. Hardavellas, S. Memik and S. Parthasarathy. In Proceedings of the Design Automation Conference (DAC), San Francisco, CA, June 2015.

Presentations

Honors & Awards


Test-of-Time Award, International Conference on Extending Database Technology (EDBT), 2019, Shore-MT: A Scalable Storage Manager for the Multicore Era. R. Johnson, I. Pandis, N. Hardavellas, A. Ailamaki and B. Falsafi. Article originally appeared in EDBT 2009.

Royal E. Cabell Fellowship, Northwestern University, 2019. Michael Wilkins.

Terminal Year Fellowship, Northwestern University, 2017. George Tziantzioulis.

Best Ph.D. Dissertation Award in Computer Engineering, Northwestern University, 2016. High-Performance and Energy-Efficient Computer System Design Using Photonic Interconnects. Yigit Demir.

NSF CAREER Award, The National Science Foundation (NSF), CISE:CCF:SHF, 2015. Energy-Efficient and Energy-Proportional Silicon-Photonic Manycore Architectures. Nikos Hardavellas.

Royal E. Cabell Fellowship, Northwestern University, 2015. Haiyang (Drake) Han.

Best Computer Engineering Poster Award, EECS Fair, Northwestern University, 2015. b-HiVE: A Bit-Level History-Based Error Model with Value Correlation for Voltage-Scaled Integer and Floating Point Units. Georgios Tziantzioulis, Ali Murat Gok and Nikos Hardavellas.

Second Computer Engineering Poster Award, EECS Fair, Northwestern University, 2015. Towards Energy-Efficient Photonic Interconnects. Yigit Demir and Nikos Hardavellas.

Second Computer Engineering Poster Award, EECS Fair, Northwestern University, 2014. EcoLaser: Adaptive Laser Control for Energy Efficient On-Chip Photonic Interconnects. Yigit Demir and Nikos Hardavellas.

Third EECS Poster Award, EECS Fair, Northwestern University, 2013. Galaxy: Pushing the Power and Bandwidth Walls with Optically-Connected Disintegrated Processors. Yigit Demir and Nikos Hardavellas.

Fellow, Searle Center for Teaching Excellence, 2012, Northwestern University. Nikos Hardavellas.

IEEE Micro Spotlight Paper, February 2012. Toward Dark Silicon in Servers. N. Hardavellas, M. Ferdman, B. Falsafi and A. Ailamaki. Article originally appeared in IEEE Micro Special Issue on Big Chips, July/August 2011.

Morrison Fellowship, Northwestern University, 2011. Yigit Demir.

Undergraduate Research Award, Northwestern University, 2011. Sourya Roy.

Keynote Talk, 9th International Symposium on Parallel and Distributed Computing (ISPDC), 2010. When Core Multiplicity Doesn't Add Up. Nikos Hardavellas.

IEEE Micro Top Picks from Computer Architecture Conferences, 2010. Near-Optimal Cache Block Placement with Reactive Nonuniform Cache Architectures. N. Hardavellas, M. Ferdman, B. Falsafi and A. Ailamaki. The Top Picks awards recognize "the year's most significant research papers in computer architecture based on novelty and long-term impact" across all computer architecture conferences.

Undergraduate Research Award, Northwestern University, 2010. Eric Anger.

June and Donald Brewer Chair, 2009-2011. Northwestern University. Nikos Hardavellas.

Best Demonstration Award, 22nd IEEE International Conference on Data Engineering (ICDE), 2006. Simultaneous Pipelining in QPipe: Exploiting Work Sharing Opportunities Across Queries. D. Dash, K. Gao, N. Hardavellas, S. Harizopoulos, R. Johnson, N. Mancheril, I. Pandis, V. Shkapenyuk and A. Ailamaki.

National Merit Scholarship, Northwestern University, 2006. Mathew Lowes.

Technical Award for Contributions to the Alpha Microprocessor, 2000. Compaq Computer Corporation, Marlborough, MA. Nikos Hardavellas.

FORTH Fellowship, 1993-1995. Foundation for Research and Technology - Hellas (FORTH), Greece. Nikos Hardavellas.

Funding



Argonne National Laboratory subcontract. Exploring Machine Learning-based Approaches to Auto-tuning Distributed Memory Communication. Peter A. Dinda and Nikos Hardavellas, 2021–2022



NSF SPX-2028851. Collaborative Research: PPoSS: Planning: Unifying Software and Hardware to Achieve Performant and Scalable Zero-cost Parallelism in the Heterogeneous Future. Peter A. Dinda, Nikos Hardavellas, Simone Campanoni, Umut Acar (CMU), Michael Rainey (CMU), Kyle C. Hale (IIT), 2020–2021

NSF CSR-1763743. Collaborative Research: Interweaving the Parallel Software/Hardware Stack. Peter A. Dinda, Simone Campanoni, Nikos Hardavellas, Kyle C. Hale (IIT), 2018–2022

NSF CCF-1453853. CAREER: Energy-Efficient and Energy-Proportional Silicon-Photonic Manycore Architectures. Nikos Hardavellas, 2015–2020

NSF CCF-1218768. SHF: Small: Collaborative Research: Elastic Fidelity: Trading-off Computational Accuracy for Energy Efficiency. Nikos Hardavellas, Seda Ogrenci-Memik, Srinivasan Parthasarathy (OSU), 2012–2015



ISEN, Booster Award. Toward Energy-Efficient Computing on Dark Silicon. Nikos Hardavellas, 2013–2014



Intel Parallel Computing Center. Nikos Hardavellas, Vadim Linetsky, Diego Klabjan, Jeremy C. Staum



Allinea Performance Analysis Software License Donation, 2015–2016



Synopsys, Semiconductor IP License Donation, 2010–2015



Cadence, Tensilica XTensa Processor Generator Software License Donation, 2013–2015



Mentor Graphics, FloTHERM/Icepack Software License Donation, 2012–2015



Wind River, Simics Software License Donation, 2009–2015

People

Faculty

Nikos Hardavellas, Associate Professor, CS & ECE

Ph.D. Students

Haiyang (Drake) Han
Vijay Kandiah
Michael Wilkins (co-advised with Peter Dinda)
Lixu Wang (co-advised with Qi Zhu)

Undergraduate Students

Dave Washington

Alumni (Ph.D.)

Ali Murat Gok
Ph.D. December 2018. Energy-Efficient Computing through Approximate Arithmetic.
First employment: Argonne National Laboratory, Mathematics and Computer Science Division.

George Tziantzioulis
Ph.D. June 2017. Harnessing Approximation for Energy- and Power-Efficient Computing.
First employment: Princeton University, Department of Electrical Engineering.

Yigit Demir
Ph.D. August 2015. High-Performance and Energy-Efficient Computer System Design Using Photonic Interconnects.
First employment: Intel, Computational Lithography Technology Group

Alumni (M.S.)

Gaurav Chaudhary
M.S. December 2020. A Simulator for Distributed Quantum Computing.
First employment: Apple

Benjamin Levinson
M.S. May 2019. Address Translation Performance Modeling.
First employment: Intel (Hillsboro, Oregon)

Vijay Kandiah
M.S. December 2017. The Impact of VaLHALLA Adders on GPUs.
First employment: Northwestern University (Ph.D.)

Zhenduo Zhai
M.S. December 2017. An Educational Tool for Multicore Design Space Analytic Modeling.
First employment: University of Missouri (Ph.D.)

Besnik Pashaj
M.S. August 2014. Performance and Power Analysis of Specialized Instruction Sets Processors.
First employment: Silicon Micro Display Inc.

Xinxin Huang
M.S. June 2013. The Impact of Process, Thermal Variations and Materials on Waveguide Losses.
First employment: Northwestern University (Enterprise Systems)

Bhargavraj Patel
M.S. June 2013. Exploring a Compressed Cache to Implement Efficient Hardware Prefetching in Multicore Processors.
First employment: Qualcomm

Ke Liu
M.S. December 2012. Hardware Error Rate Characterization with Below-Nominal Supply Voltages.
First employment: Intel CCDO (Hillsboro, Oregon)

Mathew Lowes
M.S. March 2011. A Feature Selection Framework for Data Prefetching.
First employment: Intel (Austin, Texas)

Alumni (Undergraduates)

Dave Washington
Project: Cache Allocation and Replacement Oracle.

Dana Wilson
B.S. June 2014. Project: Design for Dark Silicon.
First employment: Google

Marija Spaic
B.S. June 2013. Project: Design for Dark Silicon.
First employment: Peddinghaus Corporation

Sourya Roy
B.S. June 2011. Honors Thesis: Elastic Fidelity: Trading-off Computational Accuracy for Energy Reduction.
First employment: Keystone Strategy (later: Location Labs/AVG, Square, Stanford)

Eric Anger
B.S. June 2010. Project: Distributed Caches.
First employment: Georgia Tech (Ph.D.)
