Michael B. Taylor
Center for Dark Silicon
Bespoke Silicon Group
Paul Allen School of Computer Science and Engineering
Dept of Electrical Engineering
University of Washington, Seattle
CSE 564 (Paul Allen Center)
+1 use email
I am recruiting heavily for both graduate students and postdocs for my team at UW.
Interested PhD/MS students and Postdocs should apply to either or both departments and drop me a note (2 chances, 1 great advisor!)
I have been a professor in the Paul Allen School of Computer Science and
Engineering and Dept. of Electrical Engineering at the University of Washington, Seattle since September 2017.
I was a Visiting Research Scientist at Google and YouTube, working on datacenter accelerators,
and before that I was a tenured professor at the University of California San Diego Computer Science and Engineering Department from 2005 to 2016.
I received a PhD in Electrical Engineering and Computer Science from MIT, and
my research centers around computer architecture but spans the stack from VLSI to compilers.
I was lead architect of the 16-core MIT Raw tiled
multicore processor, one of the earliest multicore processors, which was commercialized into the Tilera
TILE64 architecture. In 2017, Intel Skylake SP adopted the scalable mesh of cores architecture that we proposed.
I co-authored the earliest published architecture research on dark silicon, the result of the end of Dennard scaling,
including a paper that derives the utilization wall that causes dark silicon,
and in that paper proposed specialization as the primary way the problem would likely be attacked.
To demonstrate these ideas, we proposed a prototype massively specialized processor called
GreenDroid, and in that work claimed that
future chips would start incorporating exponentially growing numbers of accelerators, a hypothesis which
has come true for Apple iPhone chips, as shown in this innovative study by David Brooks's team at Harvard.
I also wrote a paper that establishes the definitive taxonomy, the Four Horsemen, for the semiconductor industry's approaches
to dealing with the problem, and a follow-on paper on the Landscape of the Dark Silicon Design Regime.
My research on dark silicon fed into the ITRS 2008 report that led Mike Muller of ARM to coin the term "dark silicon".
More recently, I wrote the first academic paper on Bitcoin mining chips and in this paper proposed that
this was evidence of a totally new model for chip development. In the old model, traditional chip companies figured out and sold general-purpose
compute chips that were targeted to a wide audience and not specific to a particular application domain. Instead, we would have
"a new age of hardware innovation tailored to
emerging application domains—an Age of Bespoke Silicon". Effectively
their former customers would drive the creation of "bespoke silicon chips" for their specific application needs.
This prediction has since come true; every hyperscale datacenter company (e.g. Facebook, Alibaba, Google, Microsoft, ...) has its own domain-specific chips under development. Immediately after that paper, from 2013 to 2016, we worked on the concept of ASIC Clouds. In 2016, my team published the first paper on ASIC Clouds.
We make the case that datacenters full of ASICs are in our near future, and show a prototypical
ASIC Cloud architecture, how they should be designed, and
how they save TCO. We proposed neural network ASIC Clouds
before Google announced their TPU, and
also proposed the use of video transcoding clouds for YouTube, an approach which Google adopted subsequent to the publication of our paper and disclosed in ASPLOS 2021.
This Communications of the ACM ASIC Cloud paper has a nice overview and retrospective of the work.
This paper was also selected to be an ACM Research Highlight; on average, only two papers per year are selected from all computer architecture publications
in the world!
In 2017, we published the first architecture paper on NRE, non-recurring engineering expense. We show how minimizing NRE can be more important for ASIC Cloud feasibility than optimizing accelerator speedup or energy efficiency. We present the first ever architect's model for NRE, using current industry parameters (paper) (youtube talk), and opening up a new area of research. With the rise of specialization and the end of Moore's law, driving down the cost of design will surely be an important driver of future research.
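The trade-off between NRE and marginal efficiency can be illustrated with a toy linear cost model. This is only a sketch and not the model from the paper: the function names, the linear form, and all numbers are hypothetical, chosen to show why a design with lower NRE can win even when its per-unit cost of computation is worse.

```python
# Toy cost model (an assumption of this sketch, not the paper's NRE model):
# total cost = one-time NRE + marginal TCO per unit of work * amount of work.

def total_cost(nre: float, tco_per_unit_work: float, work: float) -> float:
    """Total cost of serving `work` units of computation with one design."""
    return nre + tco_per_unit_work * work


def break_even_work(nre_a: float, tco_a: float, nre_b: float, tco_b: float) -> float:
    """Amount of work at which design B (higher NRE, lower marginal TCO)
    becomes cheaper overall than design A (lower NRE, higher marginal TCO)."""
    if nre_b <= nre_a or tco_b >= tco_a:
        raise ValueError("expected B to have higher NRE and lower marginal TCO than A")
    return (nre_b - nre_a) / (tco_a - tco_b)
```

With hypothetical values, `break_even_work(1.0, 3.0, 9.0, 1.0)` is 4.0: below four units of work the low-NRE design A is cheaper in total, even though design B is three times more efficient per unit of work.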
Also in 2017, we designed the architecture and wrote the RTL for the open source 511-core RISC-V compatible Celerity chip,
and together with Michigan taped it out in TSMC 16nm technology. The chip broke the world record for all silicon chips on the planet for Coremark performance
and also beat the world record for single-chip RISC-V chip performance by over 100X.
Returning the favor for Michigan, we taped out and brought up the Michigan OuterSpace sparse matrix chip, in TSMC 40nm technology.
More recently, at UW my team taped out and brought up two complex Global Foundries 12nm chips -- on the same day! This
is unprecedented for a university.
One is for the DARPA POSH BlackParrot RISC-V multicore project.
The other is for the DARPA SDH-funded HammerBlade RISC-V GP-GPU project.
Both designs are fully open source.
I occasionally help companies and other legal professionals evaluate their patent portfolios,
and provide advice to companies leveraging the Tilera TILE64 architecture, or that are developing cryptocurrency hardware. I have broad expertise in hardware and software, and on the Bitcoin cryptocurrency.
My research is funded primarily by the National Science Foundation (NSF), including the Secure and Trustworthy Cyberspace Program, and DARPA/SRC's C-FAR and ADA centers.
Between the gaps at school, I worked on Apple's NuKernel microkernel, and co-wrote the first version
of Connectix Virtual PC, an x86-to-PowerPC dynamic translation engine, which was acquired by Microsoft. I also
contributed to the ChipWrights Visual Signal Processor in its
I received the NSF CAREER Award in 2009 and tenure in 2012.
My research sponsors:
I direct the
UW Center for Dark Silicon and the Bespoke Silicon Group.
My colleagues and I were among the first to demonstrate
the existence of a utilization wall
which says that with the progression of Moore's Law, the percentage of a chip that we can actively use within a chip's power budget is dropping exponentially! The remaining silicon
that must be left unpowered is now referred to as Dark Silicon.
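The exponential drop can be sketched with first-order scaling arithmetic. The code below is a simplified illustration, assuming the textbook scaling exponents for a feature-size scaling factor `s` per step and dynamic power proportional to N * C * V^2 * f; it ignores leakage and other effects, and the function names are hypothetical.

```python
# Hedged sketch: usable fraction of a fixed-area chip under a fixed power
# budget after scaling by a factor s. The exponents are first-order
# assumptions of this sketch, not measured values.

def utilization(s: float, dennard: bool) -> float:
    """Fraction of the chip that can switch at full frequency within the
    original power budget after scaling feature size by s (s > 1)."""
    transistors = s ** 2                  # more devices fit in the same area
    frequency = s                         # each device can switch faster
    capacitance = 1.0 / s                 # each device has less capacitance
    vdd = 1.0 / s if dennard else 1.0     # supply voltage no longer scales post-Dennard
    # Power of the fully active chip, relative to the original budget.
    full_chip_power = transistors * capacitance * vdd ** 2 * frequency
    return min(1.0, 1.0 / full_chip_power)


def dark_fraction_after(generations: int, s: float = 1.4) -> float:
    """Fraction of the chip that must stay dark after several post-Dennard steps."""
    return 1.0 - utilization(s ** generations, dennard=False)
```

Under these assumptions, `utilization(2.0, dennard=True)` is 1.0, while `utilization(2.0, dennard=False)` is 0.25: once the supply voltage stops scaling, the usable fraction falls as 1/s^2, which compounds exponentially across process generations.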
Our research on Conservation Cores and GreenDroid proposes new architectures that exploit dark silicon. Our paper on The Four Horsemen (slides) overviews the landscape of architectural approaches to addressing dark silicon.
In addition to researching architectures for dark silicon, I look more broadly at sources of under-utilization
in current day chips, spanning from
a) power limitations because of poor CMOS scaling, b) overly large software engineering costs for parallelizing programs for multicore chips, and
c) lack of parallel application domains.
My research attacks each of these problems by 1) reinventing processor design to make use of dark silicon, 2) utilizing existing cores better through better parallel software engineering tools and 3) finding new parallel application classes to put cores to work:
The GreenDroid Mobile Applications Processor, which employs
Conservation Cores to fight dark silicon.
Our ASPLOS 2010 paper is one of the earliest
peer-reviewed architecture papers to have a cogent description of the utilization wall that
causes the Dark Silicon problem, and to propose specialization as an architectural solution.
Our Hotchips 2010 work
GreenDroid: A Mobile Application Processor for a Future of Dark Silicon fleshes out this proposal,
and is quite possibly the first published academic use of the term Dark Silicon. This was
followed up with
this March 2011 IEEE Micro paper. (Here is the Hotchips talk on youtube.)
Our work, Is Dark Silicon Useful? Harnessing the Four Horsemen of the Coming Dark Silicon Apocalypse, which appeared at DAC and DaSi 2012 and was followed by a paper in IEEE Micro 2013, is the first paper to overview the landscape of architectural approaches that try to address the dark silicon problem. We describe the four horsemen -- four approaches to dealing with dark silicon, each with deep-seated challenges but also unique capabilities. See the slides for a very entertaining presentation on the shrinking, dim, specialized, and deus ex machina horsemen.
Kremlin, a tool that, given a serial program,
tells you which regions to parallelize.
To create Kremlin, we developed a novel dynamic analysis, hierarchical critical path analysis, to detect parallelism across nested regions of the program.
This analysis feeds a parallelism planner that evaluates many potential parallelizations to figure out the best way for the user to parallelize the target program.
The San Diego Vision Benchmark Suite, which distills the emerging computer vision application class into a collection
of nine benchmarks written in a research-friendly style. This work was co-advised by Prof. Serge Belongie, a member of UC San Diego's top-notch vision faculty.
More recently, we have developed CortexSuite, which extends SD-VBS with a large suite of machine learning and other applications at which the brain has traditionally been better than computers.
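The basic idea behind critical path analysis, as used in Kremlin above, can be illustrated on a small dependency graph. This is a simplified sketch of the classic work / critical-path ratio for a single flat region, not Kremlin's hierarchical algorithm; the function name and the input format are hypothetical.

```python
from functools import lru_cache

# Simplified illustration: available parallelism of one region, computed as
# total work divided by the length of the longest dependency chain.

def available_parallelism(cost, deps):
    """cost: dict mapping each operation to its latency.
    deps: dict mapping an operation to the operations it depends on
    (operations missing from deps have no dependencies)."""

    @lru_cache(maxsize=None)
    def finish_time(op):
        # Earliest time op can finish if every operation starts as soon as
        # all of its dependencies have finished.
        earliest_start = max((finish_time(d) for d in deps.get(op, ())), default=0)
        return earliest_start + cost[op]

    work = sum(cost.values())
    critical_path = max(finish_time(op) for op in cost)
    return work / critical_path
```

For four unit-cost operations, a fully serial chain yields 1.0, four independent operations yield 4.0, and a diamond (`b` and `c` depend on `a`, `d` depends on both) yields 4/3. Regions with a high ratio are the natural candidates to suggest for parallelization.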
- NoC Symbiosis
Daniel Petrisko, Chun Zhao, Scott Davidson, Paul Gao, Dustin Richmond and Michael Bedford Taylor.
in NOCS 2020.
- Ruche Networks: Wire-Maximal, No-Fuss NoCs
Dai Cheol Jung, Scott Davidson, Chun Zhao, Dustin Richmond, Michael Bedford Taylor.
in NOCS 2020
- ASIC Clouds: Specializing the Datacenter for Planet-Scale Applications. Michael Bedford Taylor, Luis Vega, Moein Khazraee, Ikuo Magaki, Scott Davidson, Dustin Richmond. In Communications of the ACM, July 2020. (pdf)(bib)
- BlackParrot: An Agile Open-Source RISC-V Multicore for Accelerator SoCs. Daniel Petrisko, Farzam Gilani, Mark Wyse, Dai Cheol Jung, Scott Davidson, Paul Gao, Chun Zhao, Zahra Azad, Sadullah Canakci, Bandhav Veluri, Tavio Guarino, Ajay Joshi, Mark Oskin and Michael Bedford Taylor. In IEEE Micro, July/August 2020. (pdf)(bib)
- The Celerity Open-Source 511-Core RISC-V Tiered Accelerator Fabric.
by Scott Davidson, Shaolin Xie, Chris Torng, Khalid Al-Hawaj, Austin Rovinski, Tutu Ajayi, Luis Vega, Chun Zhao, Ritchie Zhao, Steve Dai, Aporva Amarnath, Bandhav Veluri, Paul Gao, Anuj Rao, Gai Liu, Rajesh K. Gupta, Zhiru Zhang, Ronald Dreslinski, Christopher Batten, Michael Bedford Taylor.
IEEE Micro, March/April 2018. (pdf)(bib)
Proceedings of Hotchips, 2017. (pdf)(bib)
- BaseJump STL: SystemVerilog needs a Standard Template Library for Hardware Design.
Michael B. Taylor.
Design Automation Conference (DAC), June 2018. (pdf)(bib)(talk)
- The Evolution of Bitcoin Hardware.
This is a great overview of Bitcoin mining hardware evolution. A follow-on to my CASES 2013 paper, it updates that groundbreaking paper to 2017.
Michael Bedford Taylor.
IEEE Computer, Sept 2017.(pdf)(bib)
- Specializing a Planet's Computation: ASIC Clouds.
Read this to get a great overview of ASIC Clouds.
Moein Khazraee, Luis Vega, Ikuo Magaki and Michael Bedford Taylor.
IEEE Micro, May/June 2017.
- Moonwalk: NRE Optimization in ASIC Clouds or, accelerators will use old silicon.
Cite this for the first paper to give a detailed NRE model
and show how NRE can be optimized/evaluated for ASIC Clouds.
Moein Khazraee, Lu Zhang, Luis Vega, and Michael Bedford Taylor.
(paper) (youtube talk) (talk)(bib).
- ASIC Clouds: Specializing the Datacenter.
Cite this for the first paper that proposes ASIC Clouds
and defines the canonical ASIC Cloud architecture. We predicted the Google TPU before it was announced, and also forecast the deployment of video transcoding clouds, which Facebook announced in March 2019. Selected as a 2016 Top Picks Paper.
Ikuo Magaki, Moein Khazraee, Luis Vega Gutierrez, and Michael Bedford Taylor.
International Symposium on Computer Architecture (ISCA), June 2016.
(teaser talk; contains only part of paper)
(5/8/16 Tech Report)
- Bitcoin and The Age of Bespoke Silicon.
(Read this for the first-ever academic publication on Bitcoin mining hardware,
a stirring account of the Bitcoin mining community that
heralds a new age of
hardware innovation tailored to emerging application domains)
Michael Bedford Taylor
International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES), Sept 2013. (Talk) (Paper) (bib)
- A Landscape of the New Dark Silicon Design Regime.
IEEE Micro, Sep/Oct 2013. (pdf) (bib)
Design Automation and Test in Europe (DATE), April 2014. DATE 2014 talk.
Berkeley E3S Symposium, Oct 2013.
- Is Dark Silicon Useful?
Harnessing the Four Horsemen of the Coming Dark Silicon Apocalypse
(Cite this for first synthesis of approaches to attacking Dark Silicon.)
Michael B. Taylor
Design Automation Conference (DAC), June 2012. (pdf) (bib) (slides).
Also presented at the Dark Silicon Workshop (DaSi) 2012.
- Conservation Cores: Reducing the Energy of Mature Computations.
(Cite this for first peer-reviewed Utilization Wall & Dark Silicon Analysis.
Also for heterogeneity as a solution to dark silicon problem.)
Ganesh Venkatesh, John Sampson, Nathan Goulding, Saturnino Garcia, Slavik Bryskin, Jose Lugo-Martinez, Steven Swanson, and Michael Bedford Taylor.
Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2010. (pdf) (talk pdf, talk ppt) (bib)
- GreenDroid: A Mobile Application Processor for a Future of Dark Silicon.
(Cite this for dark silicon's impact on multicore scaling.
Also, for proposing the use of HLS-generated accelerators as a way to scale energy efficiency in the mobile space, a path that Apple and Qualcomm have since taken.)
Nathan Goulding, Jack Sampson, Ganesh Venkatesh, Saturnino Garcia, Joe Auricchio, Jonathan Babb, Michael Bedford Taylor and Steven Swanson.
Proceedings of HOTCHIPS, August 2010. (pdf) (talk ppt) (bib) (youtube)
- The GreenDroid Mobile Application Processor: An Architecture for Silicon's Dark Future.
Nathan Goulding-Hotta, Jack Sampson, Ganesh Venkatesh, Saturnino Garcia, Joe Auricchio, Po-Chao Huang, Manish Arora, Siddhartha Nath, Jonathan Babb,
Steven Swanson, and Michael Bedford Taylor.
IEEE Micro, March 2011. (pdf) (bib)
- Kremlin: Rebooting and Rethinking gprof for the Multicore Age (Cite this for Kremlin.)
(aka Automatic Parallelism Planning and Discovery with Kremlin)
Saturnino Garcia, Donghwan Jeon, Chris Louie, and Michael Bedford Taylor.
Programming Language Design and Implementation (PLDI), June 2011. (pdf) (bib)
- SD-VBS: The San Diego Vision Benchmark Suite.
Sravanthi Kota Venkata, Ikkjin Ahn, Donghwan Jeon, Anshuman Gupta, Christopher Louie, Saturnino Garcia, Serge Belongie, and Michael Bedford Taylor.
IEEE International Symposium on Workload Characterization (IISWC), October 2009. (pdf) (Download SD-VBS) (bib)
- Evaluation of the Raw Microprocessor:
An Exposed-Wire-Delay Architecture for ILP and Streams
by Michael B Taylor, Walter Lee, Jason Miller, David Wentzlaff, Ian Bratt, Ben Greenwald, Henry Hoffmann, Paul Johnson, Jason Kim, James Psota, Arvind Saraf, Nathan Shnidman, Volker Strumpen, Matt Frank, Saman Amarasinghe, and Anant Agarwal.
Proceedings of the International Symposium on Computer Architecture (ISCA), June 2004. (pdf) (bib)
- A 16-issue multiple-program-counter microprocessor
with point-to-point scalar operand network,
by Michael B Taylor, Jason Kim, Jason Miller,
David Wentzlaff, Fae Ghodrat, Ben Greenwald, Henry Hoffmann, Paul Johnson,
Walter Lee, Arvind Saraf, Nathan Shnidman, Volker Strumpen, Saman Amarasinghe,
and Anant Agarwal.
Proceedings of the IEEE International Solid-State
Circuits Conference (ISSCC), February 2003. (pdf) (bib)
- The Raw Microprocessor:
A Computational Fabric for Software Circuits and General Purpose Programs,
by Michael B Taylor, Jason Kim, Jason Miller, David Wentzlaff, Fae Ghodrat, Ben Greenwald, Henry Hoffmann, Jae-Wook
Lee, Paul Johnson, Walter Lee, Albert Ma, Arvind Saraf, Mark Seneski, Nathan Shnidman, Volker Strumpen, Matt Frank, Saman Amarasinghe and Anant Agarwal.
IEEE Micro, March/April 2002. (pdf) (bib)
- Scalar Operand Networks,
by Michael B Taylor, Walter Lee, Saman Amarasinghe, and Anant Agarwal.
IEEE Transactions on Parallel and Distributed Systems (Special Issue on On-chip Networks) (TPDS), February 2005.
(Appendix pdf) (bib)
|April 2020||Coindesk writes an article leveraging the graphs from my Bitcoin mining evolution paper.
|April 2020||Closing quote in this Communications of the ACM article on RISC-V: Says Michael Taylor, an associate professor in the School of Computer Science and Engineering at the University of Washington in Seattle, "There are no serious technical or practical issues with RISC-V. It will eventually supplant x86 and ARM as the primary instruction set for microprocessors. It will fundamentally change the computing world."
|July 2019||Twin tapeouts in 12nm of the BlackParrot RISC-V multicore chip and also the HammerBlade ML/Graph supercomputer chip! I think it's safe to say that no other university research group in the world has taped out two 12nm chips in a single day, and especially not of this complexity! ;-)
|May 2018||Released first version of Luis Vega's excellent Amazon F1 Accelerator Tutorial.
|Mar 2018||Received a Google Faculty Research Award! link.
|Feb 2018||Quoted in Bloomberg! link.
|Feb 2018||Received an Alibaba AIR Research Award! link.
|Dec 2017||Interviewed by Wired for this article on Global Bitcoin mining energy usage.
|Aug 2017||Interviewed in this article about Bitmain (the primary Chinese bitcoin mining chip company) moving into ML datacenter chips, an act foreshadowed by our work on ASIC Clouds.
|Apr 2017||We taped out Celerity, a fully open source RISC-V chip with neural network accelerator that contains 511 RISC-V cores, in 16 nm TSMC technology. UCSD, Cornell and Michigan teamed together on this ambitious project and we completed the entire design and tapeout in less than a year. The architecture of this chip was published at Hotchips 2017, and was jointly presented by one student each from Cornell, Michigan, and my team, the Bespoke Silicon Group.
|Mar 2017||Check out my guest sigarch blog post on the Geocomputer Computer and the Commercial Borg. This is the lead article in a series of posts by top computer architects across the planet! Local copy: pdf.
|Jan 2017||Sabbatical at Google in their datacenter accelerator group (until September)!
|Dec 2016||Our second ASIC tapeout, a ten-core manycore RISC-V processor, sent to the fab on Dec 22!
|Dec 2016|| Our ASIC Cloud ISCA paper was accepted as an IEEE Micro Top Pick. This means it was one of the 12 best computer architecture papers out of the hundreds published in 2016!
|Nov 2016|| Gave a talk at the fifth RISC-V workshop on some of our open source activities.
|Nov 2016|| Our ASPLOS paper was accepted. Congrats Moein, Lu, and Luis!
|Oct 2016||Our first ASIC tapeout, a high speed communications interface chip, sent to the fab!
|Jun 2016|| ASIC Cloud paper is out at ISCA. This paper is as big a deal as our groundbreaking 2010 ASPLOS paper
that showed the utilization wall that causes dark silicon, and proposed specialization as the solution.
We show how datacenters full of ASIC accelerators are the next step for computer architecture,
and show the entire TCO analysis, and a prototypical architecture and methodology for designing 4 kinds of ASIC clouds.
|Aug 2016||Invited Talk at DARPA / MTO CHIPS Workshop!
|Jun 2016||See my talk, on open source hardware at the Architecture 2030 workshop. (slides).
|May 2016||Tech report on ASIC Cloud released.
|Jan 2016||NSF has funded our $3M proposal, joint with Rosario Gennaro (CUNY), Abhi Shelat (UVa), Siddhartha Garg (NYU), Mariana Raykova (Yale), and my group (Bespoke Silicon Group). This is exciting!
|Sept 2015||Presented CortexSuite to DARPA StarNet SONIC center at UIUC!
|Sept 2015||Official release of Kremlin is finally out thanks to Prof. Sat Garcia!!
|Apr 29 2015||Final interview about possibilities for dark silicon in this article.
|Apr 14 2015||Interview about Dark Servers in this article in Semiconductor Engineering.
|Apr 9 2015||One on one interview about Dark Silicon in this article in Semiconductor Engineering.
|Feb 2015||MBT serves as General Chair of HPCA 2015!
|Feb 2015||Quoted discussing 14nm in this article in Semiconductor Engineering.
|Jan 2015||Our work on Dark Silicon and GreenDroid is extensively
discussed in this article at Semiconductor Engineering.
|Nov 2014||My work on Bitcoin referenced in this coindesk article on Moore's Law.
|Sep 2014||More of my Bitcoin-related work is referenced in this coindesk article on immersion cooling, with a few quotes from me.
|June 2014||My views on a DAC panel quoted and discussed in Brian Fuller's article in semiengineering.com on challenges for Computer Vision chips.
|June 2014|| Interviewed in this article in Bitcoin magazine.
|May 2014||Article in Korn Ferry Institute featuring my CASES paper.
|May 2014||This work on state attacks on Bitcoin employs the models in my 2013 CASES paper on Bitcoin mining.
|Mar 2014||My Bitcoin mining paper work referenced in Electronic Cooling Magazine.
|Mar 2014|| See my E3S Talk, entitled "A Landscape of the New Dark Silicon Regime". I gave a similar talk at DATE 2014 in Dresden, Germany.
Here is some press in the EE Journal Blog.
|Jan 2014||Article in Data Center Knowledge discussing my Bitcoin CASES paper.
|Dec 2013|| I helped a reporter from the New York Times, Nathaniel Popper, as he deliberated on whether to fly to Iceland to meet a stranger and view their installation! He put together this great story on recent developments in Bitcoin mining. (I am quoted a few times.)
Also quoted in a Venture Beat article.
|Oct 2013|| Because of some of my Bitcoin knowledge, a few newspapers, radio shows, and a TV show interviewed me about Bitcoin, Tor and the Silk Road takedown:
|Sep 2013|| I helped a reporter from the Wall Street Journal put together this front page article on Bitcoin mining.
|Sep 2013 || New CASES paper, Bitcoin and the Age of Bespoke Silicon.
|Sep 2013 || New IEEE Micro paper, A Landscape of the New Dark Silicon Design Regime.
|May 2013||My PhD student, Jack Sampson -> tenure-track assistant professor @ Penn State.
My PhD student, Saturnino Garcia -> tenure-track assistant professor @ University of San Diego.
|June 2012||I presented my paper
Is Dark Silicon Useful?
Harnessing the Four Horsemen of the Coming Dark Silicon Apocalypse (slides) at DAC 2012 and DaSi 2012.
|November 2012||Quoted in this article in the November 2012 IEEE Computer Magazine on Exascale computing.
|May 2012||Quoted right at the beginning of this May 2012 IEEE Computer Magazine Article on Dark Silicon, right after Bill Dally! GreenDroid and Conservation Cores get a big shout-out for being a key approach for attacking the Dark Silicon problem.
|March 2011|| GreenDroid IEEE Micro article, The GreenDroid Mobile Application Processor: An Architecture for Silicon's Dark Future now available!
|Nov 2010|| UCSD ACM Programming Team, which I coach, invited to Worlds in Egypt!
|Aug 2010|| We present our GreenDroid mobile application processor design at Hotchips! Our Hotchips work was the only academic talk in the entire conference.
Broad coverage in the media:
|May 2010|| HOTCHIPS paper on our C-core-based chip accepted:
GreenDroid: A Mobile Application Processor for a Future of Dark Silicon.
First conference publication to have dark silicon in the title.
|Nov 2009|| Just released The San Diego Vision Benchmark Suite, a benchmark for the vision application domain, written in MATLAB and clean C.
It's available at parallel.ucsd.edu/vision.
|Nov 2009|| Our paper, Conservation Cores: Reducing the Energy of Mature Computations, was accepted into ASPLOS.
If you read one architecture paper this year, read this ASPLOS Paper. First paper that describes the utilization wall that is the
source of dark silicon, and proposes that heterogeneity is the answer.
|July 2009|| Successfully passed the FAA written test and landed an airplane four times at Long Beach Airport (LGB)!
|June 2009|| National Science Foundation CAREER Award: Energy-Efficient Parallel Architectures for Computer Vision.
Who came up with the term "dark silicon"?
The first use of the term in print was a quote by Bob Metcalfe in March 1997 in the IEEE Internet Computing magazine. However
he was referring to all of the sand in the world that has not yet been turned into chips!
The first mention of the term I've seen in its current context was by ARM CTO Mike Muller at ARM techcon in October 2009. I've heard other folks
say that the term was used by others in ARM and/or the HiPEAC community earlier than that. Although ARM techcon happened
after we had submitted our ASPLOS paper in Aug 2009 that discussed the utilization wall,
we thought it was genius and decided to use the term in the title of our immediately following
Hotchips 2010 paper.
How did our group arrive at the utilization wall which causes Dark Silicon, and specialization as an approach to attacking it?
In 2003, I spent a few months reading 300+ ISSCC
and IEDM papers with the goal of comparing the (very different) IBM CMOS7SF
and Intel P858 fabrication processes as part of a Raw-versus-Pentium-3 section of the Raw ISCA paper I was working on.
I was also trying to understand VLSI scaling better so that we could make better proofs about Raw's optimality.
In 2004, I was trying to come up with some ideas for research as a faculty member.
I decided to analyze the scalability of multicore chips like Raw across process generations. Using skills picked up from the study
I did for the Raw ISCA paper,
I arrived at the conclusion that there was an exponentially worsening power issue with multicore scaling and that the problem
was the utilization wall and the dark silicon it creates. The analysis is the same as appears in our subsequent grant proposals and papers.
On the interview trail for faculty positions in 2005, I tried to sell the idea of the utilization wall
one-on-one with interviewing faculty and further proposed that the "ugly chip" (a massively heterogeneous design) was
a logical response. Most everybody didn't believe me or thought it was a terrible idea (James Hoe of CMU, to his credit,
thought it was interesting.).
In 2006, as brand new faculty members,
Steve Swanson and I cowrote a peer-reviewed 2006 NSF proposal that
outlined the utilization wall and created a plan for exploring massively heterogeneous solutions.
(Indeed, Steve named our analysis the
utilization wall, and already himself had a CAREER award on software aspects of heterogeneity.)
(Here is an April 2007 snapshot of our public website talking about the utilization wall.)
After one round
of rejection by peer review, the proposal was funded. The utilization wall
appears in the abstract of our NSF Award in July 2008.
After many paper resubmissions and countless co-advising trials and tribulations, we
finally got the utilization wall in peer-reviewed academic literature in this March 2010 ASPLOS paper.
MIT Raw Processor
As one of the lead students in the
MIT Raw project, I led the design and implementation
of the Raw microprocessor, which targeted the leading VLSI technology of the time.
I also contributed heavily to almost all of the software systems that we built to support the
Raw was one of the earliest fabricated multicore processors, with 16 cores on a single die, back in 2002.
The purpose of Raw was to demonstrate architectural solutions to scalability problems
in modern day microprocessors. The Raw architecture exposes the transistor resources
of VLSI chips through the tile abstraction, the pin resources through the
I/O port abstraction, and the wiring resources
through on-chip networks. Raw was commercialized into the Tilera TILE64 architecture.
Because the Raw architecture exposed the on-chip resources more effectively than existing
sequential architectures (for instance the P6 micro-architecture, the basis of
the Pentium-M), Raw was able to outperform Intel desktop
processors, implemented with better process technology, across a variety of applications.
One of the key ideas that came out of the Raw research was the
formulation of the Scalar
Operand Network (SON), a unique class of sub-nanosecond network
responsible for routing operands between functional units and memories
in a distributed microprocessor.
My team implemented the 16-tile Raw microprocessor, shown to
the upper-left, in IBM's SA-27E 180 nm 6-layer Cu ASIC process. The
18.2 mm x 18.2 mm chip was, at least at the time, the largest design
that the IBM ASIC division had targeted for SA-27E. Each tile contains
computing power equivalent to a single-issue pipelined processor.
A supercomputer prototype, based on 4-chip boards, that scaled to 64 Raw chips (1024-issue) was
More pictures are available here.
Technology Policy Advocacy
Here are my notes about Getting Interviewed as an Expert on TV, based on my experiences getting interviewed by various newspapers, radio shows, and a live TV news program about Bitcoin and the Silk Road.
Testimony regarding Massachusetts House Bill No. 2743, entitled An Act to Improve Broadband and Internet Security,
Massachusetts Joint Committee On Criminal Justice on April 2, 2003.
This testimony was referenced by Ed Felten's Freedom to Tinker website and discussed
in a law journal article:
"Super-DMCA" Statutes: Putting Hollywood in Charge of Internet Business,
Matthew A. Verga, Wake Forest Intellectual Property Law Journal 104, May 2004.