Simulation And Evaluation
How can experiments be more systematic and comparable?
What is the State of the Art?
- Creation of "meaningful" distribution (e.g., Zipf distribution)?
- Use data from search engines -- didn't work too well
- It is possible, though. I will put a simple generator online when I get back home and correct the code - for now it is only ugly bash+awk. For the general idea you can take a look here: http://doi.acm.org/10.1145/1266894.1266939
- Use PlanetLab for realistic processing and communication delays
- We also need meaningful real-life topologies. While one can use any of the many available generators, real-life data is always applicable. As one possible starting point, you could look here: http://www.opte.org/
- -> What are others doing?
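As a rough illustration of the Zipf idea above, here is a minimal sketch of a subscription workload generator. It is not the bash+awk generator mentioned earlier; the function names, the parameters (`subs_per_client`, the exponent `s`, `seed`), and the choice to apply Zipf to topic popularity are all assumptions made for this example.

```python
import random
from itertools import accumulate


def zipf_weights(n_topics, s=1.0):
    """Unnormalized Zipf weights: the topic of rank k gets weight 1 / k**s."""
    return [1.0 / (rank ** s) for rank in range(1, n_topics + 1)]


def generate_subscriptions(n_subscribers, n_topics, subs_per_client=3, s=1.0, seed=0):
    """Assign each client a set of distinct topics drawn by Zipf popularity.

    Topic 0 is the most popular one. A fixed seed makes the workload
    reproducible, so it can be published alongside the results.
    """
    rng = random.Random(seed)
    cum = list(accumulate(zipf_weights(n_topics, s)))
    wanted = min(subs_per_client, n_topics)
    workload = {}
    for client in range(n_subscribers):
        topics = set()
        # Redraw until the client has the requested number of distinct topics.
        while len(topics) < wanted:
            topics.add(rng.choices(range(n_topics), cum_weights=cum, k=1)[0])
        workload[client] = sorted(topics)
    return workload


if __name__ == "__main__":
    workload = generate_subscriptions(n_subscribers=5, n_topics=20, seed=1)
    for client, topics in workload.items():
        print(client, topics)
```

Publishing the seed and parameters together with the generator would let others regenerate exactly the same workload, which is one cheap way to make experiments comparable.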
What Are We Striving For?
- Have large data-sets for different scenarios
- Find models for distribution of events, subscriptions, etc. and extrapolate
- Define benchmarks based on the scenarios/models
- -> Anything important missing?
What can WE do?
- Publish data used for evaluation on the website!
- Publish workload generator/simulation used!
- Use pub/sub and produce our own data :)
- -> Has this already been done?
Where can we obtain realistic workloads and data sets?
- For a start:
- [http://www.nysedata.com NYSE Data] -> No subscription information
- [http://investopedia.com Investopedia]
- Use traces from peer-to-peer systems (e.g., Gnutella, traces from University of Washington)
- How can they be converted to be used for our purposes?
- Look at the benchmark suites currently developed there [UPDATE: Matteo]
- Information filtering/retrieval benchmarks (Annika?)
- Can we come up with a kind of game to gather data?
- Workload scheduling trace
- Intrusion detection systems may provide realistic data (e.g., Snort)
- The Gryphon project has data used in papers
- Ask TIBCO for data
- Use information from applications that build on pub/sub
- eBay as a potential source (scrape data and publish)
- Use information from business processes
- Can we instrument games (like multiplayer games where characters subscribe to events in their neighborhood)?
- Can we use data, e.g., from [http://webcq.com WebCQ]
- Can we use information extracted from weblogs?
- PlanetLab as source for processing/communication delays?
- AOL log data is available here: http://www.gregsadetsky.com/aol-data/ (if it goes off-line, I can send you a CD -- ZbigniewJerzak)
What Data Do We Need?
- How are subscriptions distributed, and what do they look like?
- Predicates, attributes, values
- What about composite events?
- How are publications distributed?
- Message rates?
- Subscriptions, publications, and meta-data
- Locality of interest?
- What does the topology look like?
- Broker degree, connectedness, communication delays, bandwidth
- How are clients joining and leaving the system?
- All of this depends on the application!
- -> Anything missing?
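To make the "predicates, attributes, values" item more concrete, here is a minimal sketch of what a content-based subscription record in a published data set might look like, together with the matching rule it implies. The schema (`Predicate`, `Subscription`, the operator set, events as plain dictionaries) is hypothetical and only meant as a discussion starter.

```python
import operator
from dataclasses import dataclass
from typing import Any, Dict, Tuple

# Comparison operators a trace format might allow (an assumed, minimal set).
OPS = {
    "==": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


@dataclass(frozen=True)
class Predicate:
    attribute: str
    op: str
    value: Any

    def holds(self, event: Dict[str, Any]) -> bool:
        # An event lacking the attribute never satisfies the predicate.
        if self.attribute not in event:
            return False
        return OPS[self.op](event[self.attribute], self.value)


@dataclass(frozen=True)
class Subscription:
    subscriber: str
    predicates: Tuple[Predicate, ...]

    def matches(self, event: Dict[str, Any]) -> bool:
        # A subscription is treated as a conjunction of its predicates.
        return all(p.holds(event) for p in self.predicates)


if __name__ == "__main__":
    sub = Subscription("client-1", (Predicate("symbol", "==", "IBM"),
                                    Predicate("price", ">", 100)))
    print(sub.matches({"symbol": "IBM", "price": 120}))
```

Composite events, locality, and churn are deliberately left out here; a real trace format would need fields for timestamps and join/leave events as well.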
What benchmarks exist or should exist?
- There is an EU project WASP with a work group on benchmarks called "Network-level benchmarks"
- There is work on EP application scenarios (cf. Dagstuhl Seminar) [UPDATE: Arno]
- "Application kernels" to modularize benchmark
- There is an EU project on benchmarks for EP (rule-based?) systems [UPDATE: Arno]
How have other communities developed and adopted benchmarks?
- There is a VLDB paper on benchmarks for SP
- Benchmarks for JMS (Alex Buchmann?)
- TPC / SPEC benchmarks
- Peer-to-Peer
- -> Are there any benchmarks driven by academia?
What are realistic models for workload generation?
What are good performance metrics?
I would like to draw the attention of the DEBS community to our paper titled "Constructing scalable overlay for pub-sub with many topics", which was published at PODC'07. The paper is available from http://www.ifi.uio.no/~romanvi/Papers/scalable-overlay-theory.ps . This work is decidedly not about a new pub-sub system; rather, it attempts to formally capture and theoretically analyze a fundamental problem of building and evaluating pub-sub overlays. Since many existing pub-sub systems have been tackling this problem from the practical standpoint, perhaps this paper can be considered a (rather small) step towards creating a unifying theory of pub-sub. Specifically, we believe that our work provides the following potential benefits for the DEBS community:
- It includes and can be further extended toward evaluation criteria for pub-sub overlays. This may be relevant for the effort of creating commonly used pub-sub benchmarks.
- It determines theoretical limits of what a practical pub-sub system designer should strive for and can hope to achieve. In particular, it includes a nearly optimal centralized algorithm for building an overlay, which can be used as a baseline for distributed implementations in practice.

The current paper version only targets topic-based pub-sub. Since this is a conference version limited in length, the list of references is very far from being comprehensive. In particular, we did not cite any major work on content-based pub-sub. We do intend to compile a comprehensive list of citations for the full version of this paper. This is an additional reason why feedback from the DEBS community would be so useful.
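As one possible reading of the evaluation criteria mentioned above, the sketch below checks whether a topic-based overlay is topic-connected (for every topic, the nodes interested in it form a connected sub-overlay) and reports edge count and node degrees. The function names and the exact set of reported metrics are assumptions made for illustration; this is not the algorithm from the paper, only a checker that a benchmark could run on any overlay.

```python
from collections import defaultdict, deque


def topic_components(interests, edges, topic):
    """Count connected components of the overlay restricted to nodes interested in `topic`."""
    members = {node for node, topics in interests.items() if topic in topics}
    adj = defaultdict(set)
    for u, v in edges:
        # Only edges between two interested nodes count for this topic.
        if u in members and v in members:
            adj[u].add(v)
            adj[v].add(u)
    seen = set()
    components = 0
    for start in members:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:  # breadth-first search over the topic's sub-overlay
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
    return components


def evaluate_overlay(interests, edges):
    """Return simple quality metrics for an overlay given as a list of undirected edges."""
    topics = set().union(*interests.values())
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(interests)
    return {
        "topic_connected": all(topic_components(interests, edges, t) <= 1 for t in topics),
        "edges": len(edges),
        "avg_degree": 2 * len(edges) / n if n else 0.0,
        "max_degree": max(degree.values(), default=0),
    }


if __name__ == "__main__":
    interests = {"a": {"x", "y"}, "b": {"x"}, "c": {"y"}}
    print(evaluate_overlay(interests, [("a", "b"), ("a", "c")]))
```

An overlay produced by any distributed implementation could be fed through such a checker and compared against a centralized baseline on the same interest assignment.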
