Daniel Lemire is a full professor in computer science. His research is focused on software performance, indexing techniques and data science. For example, he works on bitmap indexes and integer-compression techniques. He also works on database design and probabilistic algorithms (e.g., universal hashing). He likes to reflect on technology and its effect on our civilization.

His work on bitmap indexes is used by companies such as eBay, Facebook, LinkedIn and Netflix in their data warehouses within big-data frameworks such as Google Procella (YouTube's database engine), Apache Hive, Druid, Netflix Atlas, LinkedIn Pinot, Apache Spark and Apache Kylin. Git, the ubiquitous version control system, uses his techniques to accelerate queries. Some of his compression software is used by Apache Arrow and Apache Impala. His indexing techniques are used to accelerate medical research. Some of his techniques have been adopted by Apache Lucene, the search engine behind sites such as Wikipedia and frameworks such as Solr and Elastic. One of his hashing techniques can be found in Google's TensorFlow, where it improves performance. His Slope One algorithm is a reference in recommender systems. In 2019, his research was featured in Quebec Science magazine: L’intelligence artificielle pour rendre les logiciels plus rapides (Making Software Faster with AI); it was also the subject of an interview on the Années Lumières show (CBC), Des serveurs informatiques plus rapides et moins énergivores (Faster and More Efficient Servers). With his collaborators in 2019, he wrote the fastest JSON parser in the world (simdjson): JSON is the de facto standard data interchange format on the Internet. The simdjson library showed for the first time that it is possible to parse JSON at gigabytes per second correctly and completely. His simdjson library is one of the top hundred most popular C++ projects of all time on GitHub. The simdjson library is used by Facebook, Shopify, Intel, Microsoft, Apache Doris and several other important systems such as Node.js.

He has written over 90 peer-reviewed publications, including more than 50 journal articles. His scientific work has been cited over 5,000 times. He has held competitive grants for over fifteen years. He regularly serves as an expert on prestigious program committees (e.g., ACM CIKM, ACM WSDM, ACM SIGIR, ACM RecSys).

He works regularly on open-source software libraries. He programs regularly in C, C++, Java, JavaScript, Python, Swift and Go. He works primarily in an open-source setting (e.g., Linux). In 2012, he received an award from the Google Open Source Peer Bonus Program. In February 2019, he was ranked second among the most popular developers on GitHub and first in C++ (ahead of Microsoft, Google, Facebook). In 2020, Daniel Lemire was one of the 100 most followed developers on GitHub. The GitHub site has 28 million developers.

The algorithm described in his article Fast Random Integer Generation in an Interval was adopted to accelerate random number generation by the Swift standard library, by the Go language, by the Julia language, by the GNU C++ library (Linux) and by Numpy (Python). The algorithm described in his article Faster Base64 Encoding and Decoding using AVX2 Instructions is used within PHP. The algorithm described in his article Faster Remainder by Direct Computation: Applications to Compilers and Software Libraries is used within the C# standard library to accelerate the Dictionary class.
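As an illustration of the kind of technique involved, here is a minimal Python sketch of the multiply-and-shift idea usually associated with Fast Random Integer Generation in an Interval. It is not taken from this page or from any of the libraries named above. The function names, the `bits` parameter and the `next_word` callback are hypothetical and chosen only for this example. The sketch assumes a source of uniform random words of a fixed width and avoids the modulo operation in the common case.

```python
import random

def map_word(x, s, bits=32):
    """Map one uniform `bits`-bit word x to [0, s), or return None to reject it."""
    m = x * s                          # full product, below s * 2**bits
    low = m & ((1 << bits) - 1)        # low half of the product
    if low < s:                        # only here can the word be in the biased zone
        threshold = (1 << bits) % s    # the single modulo, rarely executed
        if low < threshold:
            return None                # rejecting these words removes the bias
    return m >> bits                   # high half of the product is the result

def bounded_random(s, bits=32, next_word=None):
    """Return an unbiased integer in [0, s), for 0 < s <= 2**bits."""
    if not 0 < s <= (1 << bits):
        raise ValueError("s must satisfy 0 < s <= 2**bits")
    if next_word is None:
        # Assumption: Python's generator stands in for a fast word generator.
        next_word = lambda: random.getrandbits(bits)
    while True:
        value = map_word(next_word(), s, bits)
        if value is not None:
            return value
```

For example, `bounded_random(6)` simulates a die roll in the range 0 to 5. With small `s`, almost every call returns after a single multiplication and shift, which is why the approach is attractive in standard libraries.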

His accelerated number parser from the article Number Parsing at a Gigabyte per Second has been adopted by the C#, Go and Rust programming languages, Apache Arrow, Yandex ClickHouse, Apple's browser (Safari), Microsoft LightGBM, and other major projects where it multiplied the number-parsing speed. The Go 1.16 release notes state that "ParseFloat now uses the Eisel-Lemire algorithm, improving performance by up to a factor of 2. This can also speed up decoding textual formats like encoding/json." The Rust 1.55 release notes state that the "standard library's implementation of float parsing has been updated to use the Eisel-Lemire algorithm, which brings both speed improvements and improved correctness". It is part of the LLVM libc standard library. It has also been adopted by Microsoft in C# as of .NET 7.

Daniel Lemire and his colleagues wrote the Ada library, which may be the fastest URL parser in the world. Ada improved the performance of the popular JavaScript runtime Node.js:

"Since Node.js 18, a new URL parser dependency was added to Node.js — Ada. This addition bumped the Node.js performance when parsing URLs to a new level. Some results could reach up to an improvement of 400%. As a regular user, you may not use it directly. But if you use an HTTP server then it’s very likely to be affected by this performance improvement." (State of Node.js Performance 2023)

He is a long-time social media user: his blog has tens of thousands of readers and has been featured on Slashdot, Reddit and Hacker News. He was one of the first Twitter users and is followed by over 16,000 people: @lemire.

For years, he organized live talks open to the public in Montreal: tribalab and technolab. Following the Spring 2020 events, the talks have been moved mostly online.

Talks (YouTube)

NodeConf EU 2023

BID 2023

SPIRE 2021

Go Systems (San Francisco, 2020)

Performance Summit III (Seattle, 2020)

QCon San Francisco 2019 (best voted talk!)

Spark Summit East 2017

Laboratory

We are lucky to have a fully equipped laboratory with a dedicated technician. We have a server farm that has been used worldwide for experiments in software performance (e.g., by researchers such as Agner Fog). Some of our machines have the following specifications:

  • Icelake microarchitecture: Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz
  • Haswell microarchitecture: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
  • Knights Landing microarchitecture: Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz (64 cores)
  • Skylake microarchitecture: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
  • Skylake-X microarchitecture: Intel(R) Xeon(R) W-2104 CPU @ 3.20GHz
  • IBM POWER9 2.2 GHz, 4-core
  • Cannonlake microarchitecture: Intel Core i3-8121U CPU @2.20GHz
  • Skylark microarchitecture (ARMv8): Ampere eMAG CPU 32 cores @ 3.3 GHz

We also have several workstations and space in our laboratory to explore virtual reality as a tool in data science.

Students and post-doctoral fellows

We are recruiting students and postdoctoral fellows. If you love writing crazily fast software and want to come to Montreal, drop us a line. A link to an impressive GitHub profile is an asset. Speaking French is necessary if you want to pursue an academic program with me at the Université du Québec, except maybe at the Ph.D. level where allowances can be made for strong students. Some of our best students are women. We offer scholarships for graduate studies in software performance for data engineering (in French).

If you are a Canadian undergraduate student with at least a B average, you might be interested in coming to work with Daniel Lemire under an NSERC Undergraduate Student Research Award. The awards help pay for a full-time research project in our Montreal labs. The application deadlines are:

  • March 1st for the Summer term;
  • July 1st for the Fall term;
  • November 1st for the Winter term.

It is an ongoing competition: we receive applications for every term. It is ok if you do not speak French. Please allow at least a week to put together an application with my help.

If you are interested in pursuing a master's degree in information technology full-time under the supervision of Daniel Lemire in Montreal and you know some French, I take applications for NSERC Graduate Scholarships. You need to have a strong academic profile to apply. You should be a Canadian citizen or permanent resident of Canada. The deadline is December 1st of each year. You must plan ahead. We take applications every year.

If you are interested in pursuing a Ph.D. in cognitive computing full-time under the supervision of Daniel Lemire in Montreal and you know some French, we take applications for NSERC graduate scholarships. You need to have a strong academic profile to apply. You should be a Canadian citizen or permanent resident of Canada. The deadline is November 1st of each year. You must plan ahead. We take applications every year.

Moreover, all students finishing an M.Sc. thesis in information technology with us receive a scholarship, automatically. All students making progress on a doctorate in cognitive computer science receive automatic scholarships. Enrolment in a PhD program implies a waiver of tuition fees for foreign students. 

Daniel Lemire regularly supervises students, from the undergraduate to the Ph.D. level. He works primarily with students who love to program and who prefer an open source setting (e.g., Linux). Many of his students contribute to open-source projects on sites such as GitHub.

He recently supervised the following Ph.D. students: 

  • Pierre Marie Ntang, cognitive computer science (graduated in 2023);
  • Gary Germeil, cognitive computer science (graduated in 2022);
  • Tarek Khei, cognitive computer science (graduated in 2020);
  • Xueping Dai, environmental science (graduated in 2019);
  • Erick Aokou Koffi, cognitive computer science (graduated in 2018);
  • Badis Merdaoui, cognitive computer science (graduated in 2017);
  • Jing Li, computer science (graduated in 2016);
  • Samy Chambi, computer science (graduated in 2016);
  • Hazel Webb, computer science (graduated in 2010).

Several of his students occupy key positions: e.g., Maxime Boisvert (M.Sc., 2017) is Production Engineering Manager at Shopify, Shany Carle (M.Sc., 2017) is a computer science professor at Cégep de Victoriaville, and Shira Smith is an engineer at Discord in California.

Education

  • Postdoctorate (Institute of Biomedical Engineering)
  • Engineering Mathematics Ph.D. (University of Montreal and Polytechnique Montréal)
  • Master in Mathematics (University of Toronto)
  • Bachelor degree in Mathematics (University of Toronto), with High Distinction

Research Interests

  • Data Science
  • Data Indexing
  • Data Engineering
  • Software Performance
  • Vectorization (SIMD)

Teaching

Research

Research program

We seek to accelerate software indexing techniques, either within search engines or within databases. In this work, we exploit recent and emerging hardware capabilities. In particular, we seek to fully benefit from vector instructions. To keep the memory close to the processor, we seek to improve index compression, whether they are inverted indexes, B-trees or bitmap indexes. We seek to uncompress data at great speed in RAM. We want to accelerate common operations such as intersections and unions.
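To make the role of intersections and unions concrete, here is a minimal Python sketch of an uncompressed bitmap representation of integer sets. It is only an illustration and not the laboratory's software: real bitmap indexes are compressed and rely on vector instructions, whereas this sketch assumes a plain Python integer as the bitmap. The helper names `to_bitmap` and `from_bitmap` are hypothetical.

```python
def to_bitmap(values):
    """Encode a collection of non-negative integers as one integer bitmap."""
    bitmap = 0
    for v in values:
        if v < 0:
            raise ValueError("only non-negative integers can be stored")
        bitmap |= 1 << v               # set bit v
    return bitmap

def from_bitmap(bitmap):
    """Decode a bitmap back into a sorted list of integers."""
    out = []
    while bitmap:
        lowest = bitmap & -bitmap      # isolate the lowest set bit
        out.append(lowest.bit_length() - 1)
        bitmap ^= lowest               # clear that bit
    return out

a = to_bitmap([1, 3, 5, 64])
b = to_bitmap([3, 4, 64, 100])

intersection = from_bitmap(a & b)      # values present in both sets
union = from_bitmap(a | b)             # values present in either set
```

With this representation, an intersection is a bitwise AND and a union is a bitwise OR, so each machine word processes many candidate values at once. Compression matters because a plain bitmap grows with the largest stored value rather than with the number of values.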

Current research grants

Publications & Presentations

Journal articles (refereed)

Nizipli, Yagiz, & Lemire, Daniel (2024). Parsing Millions of URLs per Second. Software: Practice and Experience, 54 (5). https://doi.org/10.1002/spe.3296

Keiser, John, & Lemire, Daniel (In Press). On-Demand JSON: A Better Way to Parse Documents?. Software: Practice and Experience.

Clausecker, Robert, & Lemire, Daniel (2023). Transcoding Unicode Characters with AVX-512 Instructions. Software: Practice and Experience, 53 (12). https://doi.org/10.1002/spe.3261

Lemire, Daniel (In Press). Exact Short Products From Truncated Multipliers. Computer Journal. https://doi.org/10.1093/comjnl/bxad077

Mushtak, Noble, & Lemire, Daniel (2023). Fast Number Parsing Without Fallback. Software: Practice and Experience, 53 (7), 1467-1471. https://doi.org/10.1002/spe.3198

Graf, Thomas Mueller, & Lemire, Daniel (2022). Binary Fuse Filters: Fast and Smaller Than Xor Filters. Journal of Experimental Algorithmics, 27. https://doi.org/10.1145/3510449

Humeau, Tom; Savard, Isabelle; Lemire, Daniel; Dionne, Pierre-Olivier; Angulo Mendoza, Gustavo Adolfo; Plante, Patrick; Pinard, Anne Marie, & Roy, Jean-Sébastien (2022). FORCES 3 : Exploitation à des fins pédagogiques des données d’un portail d’apprentissage de l’autogestion de la douleur. Développement d’une architecture de collecte et d’analyse de données et d’un module de suivi du développement des compétences. Médiations et médiatisations (12), 74-97. https://doi.org/10.52358/mm.vi12.287

Lemire, Daniel, & Muła, Wojciech (2022). Transcoding Billions of Unicode Characters per Second with SIMD Instructions. Software: Practice and Experience, 52 (2).

Humeau, Tom; Savard, Isabelle; Dionne, Pierre-Olivier; Angulo-Mendoza, Gustavo; Plante, Patrick; Pinard, Anne Marie, & Lemire, Daniel (2022). FORCES 3 : Exploitation à des fins pédagogiques des données d’un portail d’apprentissage de l’autogestion de la douleur. Développement d’une architecture de collecte et d’analyse de données et d’un module de suivi du développement des compétences. Médiations & médiatisations (12), 74-97. https://doi.org/10.52358/mm.vi12.287

Klarqvist, Marcus D. R.; Muła, Wojciech, & Lemire, Daniel (2021). Efficient Computation of Positional Population Counts Using SIMD Instructions. Concurrency and Computation: Practice and Experience, 33 (17). https://doi.org/10.1002/cpe.6304

Lemire, Daniel; Bartlett, Colin, & Kaser, Owen (2021). Integer Division by Constants: Optimal Bounds. Heliyon, 7 (6). https://doi.org/10.1016/j.heliyon.2021.e07442

Keiser, John, & Lemire, Daniel (2021). Validating UTF-8 In Less Than One Instruction Per Byte. Software: Practice and Experience, 51 (5), 950-964. https://doi.org/10.1002/spe.2920

Lemire, Daniel (2021). Number Parsing at a Gigabyte per Second. Software: Practice and Experience, 51 (8). https://doi.org/10.1002/spe.2984

Lewis, François; Plante, Patrick, & Lemire, Daniel (2021). Pertinence, efficacité et principes pédagogiques de la réalité virtuelle et augmentée en contexte scolaire : une revue de littérature. Médiations & médiatisations (5), 11-27.

Graf, Thomas Mueller, & Lemire, Daniel (2020). Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters. Journal of Experimental Algorithmics, 25 (1). https://doi.org/10.1145/3376122

Muła, Wojciech, & Lemire, Daniel (2020). Base64 encoding and decoding at almost the speed of a memory copy. Software: Practice and Experience, 50 (2), 89-97. https://doi.org/10.1002/spe.2777

Lemire, Daniel; Kaser, Owen, & Kurz, Nathan (2019). Faster Remainder by Direct Computation: Applications to Compilers and Software Libraries. Software: Practice and Experience, 49 (6), 953-970. https://doi.org/10.1002/spe.2689

Dai, Xueping; Cheng, Li Zhen; Mareschal, Jean-Claude; Lemire, Daniel, & Liu, Chong (2019). New method for denoising borehole transient electromagnetic data with discrete wavelet transform. Journal of Applied Geophysics, 168, 41-48. https://doi.org/10.1016/j.jappgeo.2019.05.009

Lemire, Daniel (2019). Fast Random Integer Generation in an Interval. ACM Transactions on Modeling and Computer Simulation, 29 (1). https://doi.org/10.1145/3230636

Lemire, Daniel, & O'Neill, Melissa (2019). Xorshift1024*, Xorshift1024+, Xorshift128+ and Xoroshiro128+ Fail Statistical Tests for Linearity. Journal of Computational and Applied Mathematics, 350, 139-142. https://doi.org/10.1016/j.cam.2018.10.019

Langdale, Geoff, & Lemire, Daniel (2019). Parsing Gigabytes of JSON per Second. VLDB Journal, 28 (6). https://doi.org/10.1007/s00778-019-00578-5

Muła, Wojciech, & Lemire, Daniel (2018). Faster Base64 Encoding and Decoding Using AVX2 Instructions. ACM Transactions on the Web, 12 (3). https://doi.org/10.1145/3132709

Li, Jing; Yan, Yuhong, & Lemire, Daniel (2018). Full Solution Indexing for top-K Web Service Composition. IEEE Transactions on Services Computing, 11 (3), 521 - 533. https://doi.org/10.1109/TSC.2016.2578924

Lemire, Daniel; Kaser, Owen; Kurz, Nathan; Deri, Luca; O'Hara, Chris; Saint-Jacques, François, & Ssi-Yan-Kai, Gregory (2018). Roaring Bitmaps: Implementation of an Optimized Software Library. Software: Practice and Experience, 48 (4), 867–895. https://doi.org/10.1002/spe.2560

Lemire, Daniel; Kurz, Nathan, & Rupp, Christoph (2018). Stream VByte: Faster byte-oriented integer compression. Information Processing Letters, 130. https://doi.org/10.1016/j.ipl.2017.09.011

Muła, Wojciech; Kurz, Nathan, & Lemire, Daniel (2018). Faster population counts using AVX2 instructions. Computer Journal, 61 (1). https://doi.org/10.1093/comjnl/bxx046

Badia, Antonio, & Lemire, Daniel (2018). On Desirable Semantics of Functional Dependencies over Databases with Incomplete Information. Fundamenta Informaticae, 158 (4), 327-352. https://doi.org/10.3233/FI-2018-1651

Ivanchykhin, Dmytro; Ignatchenko, Sergey, & Lemire, Daniel (2017). Regular and almost universal hashing: an efficient implementation. Software: Practice and Experience, 47 (10). https://doi.org/10.1002/spe.2461

Lemire, Daniel, & Rupp, Christoph (2017). Upscaledb: Efficient Integer-Key Compression in a Key-Value Store using SIMD Instructions. Information Systems, 66, 13–23. https://doi.org/10.1016/j.is.2017.01.002

Lemire, Daniel; Ssi-Yan-Kai, Gregory, & Kaser, Owen (2016). Consistently faster and smaller compressed bitmaps with Roaring. Software: Practice and Experience, 46 (11), 1547-1569. https://doi.org/10.1002/spe.2402

Chambi, Samy; Lemire, Daniel, & Godin, Robert (2016). Vers de meilleures performances avec des Roaring bitmaps. Technique et Science Informatiques, 35 (3), 335-355.

Lemire, Daniel, & Boytsov, Leonid (2015). Decoding billions of integers per second through vectorization. Software: Practice & Experience, 45 (1), 1-29. https://doi.org/10.1002/spe.2203

Lemire, Daniel, & Kaser, Owen (2014). Strongly universal string hashing is fast. Computer Journal, 57 (11), 1624-1638. https://doi.org/10.1093/comjnl/bxt070

Webb, Hazel; Lemire, Daniel, & Kaser, Owen (2013). Diamond dicing. Data & Knowledge Engineering, 86. https://doi.org/10.1016/j.datak.2013.01.001

Lemire, Daniel; Kaser, Owen, & Gutarra, Eduardo (2012). Reordering rows for better compression: Beyond the lexicographic order. ACM Transactions on Database Systems, 37 (3). https://doi.org/10.1145/2338626.2338627

Zhu, Xiaodan; Turney, Peter; Lemire, Daniel, & Vellino, Andre (2015). Measuring academic influence: Not all citations are equal. Journal of the Association for Information Science and Technology, 66 (2), 408-427. https://doi.org/10.1002/asi.23179

Badia, Antonio, & Lemire, Daniel (2015). Functional dependencies with null markers. Computer Journal, 58 (5), 1160-1168. https://doi.org/10.1093/comjnl/bxu039

Kaser, Owen, & Lemire, Daniel (2006). Attribute value reordering for efficient hybrid OLAP. Information Sciences, 176 (16), 2304-2336. https://doi.org/10.1016/j.ins.2005.09.005

Lemire, Daniel (2006). Streaming maximum-minimum filter using no more than three comparisons per element. Nordic Journal of Computing, 13 (4), 328-339.

Lemire, Daniel (2005). Scale and translation invariant collaborative filtering systems. Information Retrieval, 8 (1), 129-150. https://doi.org/10.1023/B:INRT.0000048492.50961.a6

Lemire, Daniel; Boley, Harold; McGrath, Sean, & Ball, Marc (2005). Collaborative filtering and inference rules for context-aware learning object recommendation. Interactive Technology and Smart Education, 2 (3). https://doi.org/10.1108/17415650580000043

Dubuc, Serge; Lemire, Daniel, & Merrien, Jean-Louis (2001). Fourier analysis of 2-point Hermite interpolatory subdivision schemes. Journal of Fourier Analysis and Applications, 7 (5), 532-552. https://doi.org/10.1007/BF02511225

Lemire, Daniel, & Kaser, Owen (2008). Hierarchical Bin Buffering: Online Local Moments for Dynamic External Memory Arrays. ACM Transactions on Algorithms, 4 (1), 1-31. https://doi.org/10.1145/1328911.1328925

Lemire, Daniel; Brooks, Martin, & Yan, Yuhong (2009). An optimal linear time algorithm for quasi-monotonic segmentation. International Journal of Computer Mathematics, 86 (7). https://doi.org/10.1080/00207160701694153

Lemire, Daniel (2009). Faster retrieval with a two-pass dynamic-time-warping lower bound. Pattern Recognition, 42 (9). https://doi.org/10.1016/j.patcog.2008.11.030

Lemire, Daniel, & Kaser, Owen (2010). Recursive n-gram hashing is pairwise independent, at best. Computer Speech & Language, 24 (4), 698-710. https://doi.org/10.1016/j.csl.2009.12.001

Lemire, Daniel; Kaser, Owen, & Aouiche, Kamel (2010). Sorting improves word-aligned bitmap indexes. Data & Knowledge Engineering, 69 (1), 3-28. https://doi.org/10.1016/j.datak.2009.08.006

Badia, Antonio, & Lemire, Daniel (2011). A call to arms: Revisiting database design. SIGMOD Record, 40 (3), 61-69. https://doi.org/10.1145/2070736.2070750

Lemire, Daniel, & Kaser, Owen (2011). Reordering Columns for Smaller Indexes. Information Sciences, 181 (12), 2550–2570. https://doi.org/10.1016/j.ins.2011.02.002

Lemire, Daniel (2012). The universality of iterated hashing over variable-length strings. Discrete Applied Mathematics, 160 (4-5), 604–617. https://doi.org/10.1016/j.dam.2011.11.009

Neylon, Cameron; Aerts, Jan; Brown, C. Titus; Coles, Simon J.; Hatton, Les; Lemire, Daniel; Millman, K. Jarrod; Murray-Rust, Peter; Perez, Fernando; Saunders, Neil; Shah, Nigam; Smith, Arfon; Varoquaux, Gaël, & Willighagen, Egon (2012). Changing computational research. The challenges ahead. Source Code for Biology and Medicine, 7 (2). https://doi.org/10.1186/1751-0473-7-2

Prekopcsák, Zoltán, & Lemire, Daniel (2012). Time Series Classification by Class-Specific Mahalanobis Distance Measures. Advances in Data Analysis and Classification, 6 (3). https://doi.org/10.1007/s11634-012-0110-6

Kaser, Owen, & Lemire, Daniel (2016). Compressed bitmap indexes: beyond unions and intersections. Software: Practice and Experience, 46 (2). https://doi.org/10.1002/spe.2289

Crainiceanu, Adina, & Lemire, Daniel (2015). Bloofi : Multidimensional Bloom Filters. Information Systems, 54. https://doi.org/10.1016/j.is.2015.01.002

Zhao, Wayne Xin; Zhang, Xudong; Lemire, Daniel; Shan, Dongdong; Nie, Jian-Yun; Yan, Hongfei, & Wen, Ji-Rong (2015). A General SIMD-based Approach to Accelerating Compression Algorithms. ACM Transactions on Information Systems, 33 (3). https://doi.org/10.1145/2735629

Lemire, Daniel; Boytsov, Leonid, & Kurz, Nathan (2016). SIMD Compression and the Intersection of Sorted Integers. Software: Practice and Experience, 46 (6).

Chambi, Samy; Lemire, Daniel; Kaser, Owen, & Godin, Robert (2016). Better bitmap performance with Roaring bitmaps. Software: Practice and Experience, 46 (5), 709–719. https://doi.org/10.1002/spe.2325

Lemire, Daniel, & Kaser, Owen (2016). Faster 64-bit universal hashing using carry-less multiplications. Journal of Cryptographic Engineering, 6 (3), 171-185. https://doi.org/10.1007/s13389-015-0110-5

Book chapters

Aouiche, Kamel; Lemire, Daniel, & Godin, Robert (2009). Web 2.0 OLAP: From data cubes to tag clouds. In Web Information Systems and Technologies. 4th International Conference, WEBIST 2008, Funchal, Madeira, Portugal, May 4-7, 2008, Revised Selected Papers. Springer, coll. « Lecture Notes in Business Information Processing », vol. 18.

Noël, Sylvie, & Lemire, Daniel (2010). On the Challenges of Collaborative Data Processing. In Foster, Jonathan (Ed.), Collaborative Information Behaviour. User Engagement and Communication Sharing (p. 55-71). IGI Global.

Papers in conference proceedings (refereed)

Miladi, Fatma; Lemire, Daniel, & Psyché, Valéry (In Press). Learning Engagement and Peer Learning in MOOC: A Selective Systematic Review. In 19th International Conference on Intelligent Tutoring Systems.

Begoli, Edmon; Camacho-Rodríguez, Jesús; Hyde, Julian; Mior, Michael, & Lemire, Daniel (2018). Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources. In Proceedings of the 2018 ACM International Conference on Management of Data (SIGMOD) (p. 221-230). https://doi.org/10.1145/3183713.3190662

Chambi, Samy; Lemire, Daniel, & Godin, Robert (2016). Nouveaux modèles d’index bitmap compressés à 64 bits. In Actes des 12es journées francophones sur les Entrepôts de Données et l'Analyse en Ligne.

Chambi, Samy; Lemire, Daniel; Godin, Robert; Boukhalfa, Kamel; Allen, Charles, & Yang, Fangjin (2016). Optimizing Druid with Roaring bitmaps. In Proceedings of the 20th International Database Engineering & Applications Symposium. ACM. ISBN 978-1-4503-4118-9 https://doi.org/10.1145/2938503.2938515

Li, Jing; Yan, Yuhong, & Lemire, Daniel (2016). Scaling up Web Service Composition with the Skyline Operator. In Proceedings of the IEEE International Conference on Web Services 2016.

Li, Jing; Yan, Yuhong, & Lemire, Daniel (2015). A web service composition method based on compact K2-trees. In Proceedings of the IEEE International Conference on Services Computing (p. 403 - 410). IEEE. ISBN 978-1-4673-7280-0 https://doi.org/10.1109/SCC.2015.62

Chambi, Samy; Lemire, Daniel, & Godin, Robert (2014). Roaring bitmap : nouveau modèle de compression bitmap. In Actes des 10e journées francophones sur les Entrepôts de Données et l'Analyse en Ligne.

Li, Jing; Yan, Yuhong, & Lemire, Daniel (2014). Full Solution Indexing Using Database for QoS-aware Web Service Composition. In Proceedings of the IEEE International Conference on Services Computing (p. 99 - 106). IEEE. ISBN 978-1-4799-5065-2 https://doi.org/10.1109/SCC.2014.22

Aouiche, Kamel; Lemire, Daniel, & Godin, Robert (2008). Collaborative OLAP with Tag Clouds: Web 2.0 OLAP Formalism and Experimental Evaluation. In Proceedings of WEBIST 2008. Portugal : Institute for Systems and Technologies of Information, Control and Communication.

Aouiche, Kamel, & Lemire, Daniel (2007). A Comparison of Five Probabilistic View-Size Estimation Techniques in OLAP. In Proceedings of the 10th International Workshop on Data Warehousing and OLAP. ACM.

Aouiche, Kamel, & Lemire, Daniel (2007). Unassuming View-Size Estimation Techniques in OLAP. In Proceedings of the 9th International Conference on Enterprise Information Systems. Portugal : INSTICC.

Kaser, Owen, & Lemire, Daniel (2007). Removing Manually-Generated Boilerplate from Electronic Texts: Experiments with Project Gutenberg e-Books. In Spencer, Bruce; Story, Margaret-Ann, & Stewart, Darlene (Ed.), Proceedings of the 2007 Conference of the Center for Advanced Studies on Collaborative Research (CASCON '07). Riverton, NJ, USA : IBM.

Kaser, Owen, & Lemire, Daniel (2007). Tag-Cloud Drawing: Algorithms for Cloud Visualization. In Proceedings of the Tagging and Metadata for Social Information Organization Workshop, 16th International World Wide Web Conference (WWW 2007). Banff, Canada : IW3C2.

Kucerovsky, Dan, & Lemire, Daniel (2007). Monotonicity Analysis over Chains and Curves. In Curve and surface fitting: Avignon 2006 (p. 180-190). Brentwood, TN, USA : Nashboro Press.

Kaser, Owen; Lemire, Daniel, & Keith, Steven (2006). The LitOLAP Project: Data Warehousing with Literature. In Proceedings of the 2006 CaSTA Conference. University of New Brunswick.

Brooks, Martin; Yan, Yuhong, & Lemire, Daniel (2005). Scale-Based Monotonicity Analysis in Qualitative Modelling with Flat Segments. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence. Edinburgh, UK : IJCAI.

Lemire, Daniel (2007). A Better Alternative to Piecewise Linear Time Series Segmentation. In Apte, Chid; Skillicorn, David; Liu, Bing, & Parthasara, Srinivasan (Ed.), Proceedings of the 2007 SIAM International Conference on Data Mining (SDM'07) (p. 545-550). Minneapolis, Minnesota : SIAM. https://doi.org/10.1137/1.9781611972771.59

Lemire, Daniel; Brooks, Martin, & Yan, Yuhong (2005). An Optimal Linear Time Algorithm for Quasi-Monotonic Segmentation. In Han, Jiawei; Wah, Benjamin W.; Vijay, Raghavan; Wu, Xindong, & Rastogi, Rajeev (Ed.), Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM-05) (p. 709-712). Piscataway, NJ : IEEE. https://doi.org/10.1109/ICDM.2005.25

Lemire, Daniel, & Maclachlan, Anna (2005). Slope One Predictors for Online Rating-Based Collaborative Filtering. In Kargupta, Hillol; Srivastava, Jaideep; Kamath, Chandrika, & Goodman, Arnold (Ed.), Proceedings of the 2005 SIAM International Conference on Data Mining (SDM'05) (p. 471-475). Newport Beach, CA : SIAM.

Kaser, Owen, & Lemire, Daniel (2003). Attribute Value Reordering for Efficient Hybrid OLAP. In Rizzi, Stefano, & Song, Il-Yeol (Ed.), Proceedings of the ACM Sixth International Workshop on Data Warehousing and OLAP (p. 1-8). New Orleans, LA : ACM.

Lemire, Daniel (2003). A Family of 4-Point Dyadic Multistep Subdivision Schemes. In Cohen, Albert; Merrien, Jean-Louis, & Schumaker, Larry L. (Ed.), Curves and Surface Fitting: Saint-Malo 2002 (p. 259-268). Brentwood, TN, USA : Nashboro Press.

Lemire, Daniel (2002). Wavelet-Based Relative Prefix Sum Methods for Range Sum Queries in Data Cubes. In Stewart, Darlene A., & Johnson, J. Howard (Ed.), Proceedings of the 2002 Conference of the Center for Advanced Studies on Collaborative Research (CASCON '02) (p. 6). Riverton, NJ, USA : IBM.

Webb, Hazel; Kaser, Owen, & Lemire, Daniel (2008). Pruning Attributes From Data Cubes with Diamond Dicing. In IDEAS '08 Proceedings of the 2008 international symposium on Database engineering & applications. ACM International Conference Proceeding Series.

Kaser, Owen; Lemire, Daniel, & Aouiche, Kamel (2008). Histogram-Aware Sorting for Enhanced Word-Aligned Compression in Bitmap Indexes. In Proceedings of the 11th ACM International Workshop on Data Warehousing and OLAP. ACM.

Lemire, Daniel, & Vellino, Andre (2011). Extracting, Transforming and Archiving Scientific Data. In Proceedings of the Fourth Workshop on Very Large Digital Libraries. DELOS Association for Digital Libraries.

Ruer, Perrine; Gouin-Vallerand, Charles; Zhang, Le; Lemire, Daniel, & Vallières, Évelyne F. (2015). An analysis tool for the contextual information from field experiments on driving fatigue. In Proceedings of the Ninth International and Interdisciplinary Conference on Modeling and Using Context (Context 2015). Springer, coll. « LNAI ».

Anderson, Michelle; Ball, Marcel; Boley, Harold; Greene, Stephen; Howse, Nancy; Lemire, Daniel, & McGrath, Sean (2003). RACOFI: A Rule-Applying Collaborative Filtering System. In Proceedings of the IEEE/WIC COLA 2003.

Plaisance, Jeff; Kurz, Nathan, & Lemire, Daniel (2015). Vectorized VByte Decoding. In Proceedings of the First International Symposium on Web Algorithms.

Conference presentations (refereed)

Plante, Patrick; Desjardins, Guillaume; Dionne, Pierre-Olivier; Marineau, Sophie; Paré, Jean-François; Sauvé, Louise; Savard, Isabelle; Pinard, Anne-Marie; Lemire, Daniel, & Angulo Mendoza, Gustavo Adolfo (Oct 2019). Game Design Service Platform for Seniors' Health and Well-being. Poster presented at the AGE-WELL 2019 Annual Conference, Moncton, Canada.

Aouiche, Kamel; Lemire, Daniel, & Kaser, Owen (Jun 2008). Tri de la table de faits et compression des index bitmaps avec alignement sur les mots. Paper presented at the 24ièmes journées 'Bases de Données Avancées'.

Papers in conference proceedings (non refereed)

Lemire, Daniel (2021). Unicode at Gigabytes per Second. In Lecroq, Thierry, & Touzet, Hélène (Ed.), SPIRE 2021: String Processing and Information Retrieval. https://doi.org/10.1007/978-3-030-86692-1_2

Other non refereed contributions

Desjardins, Guillaume, & Plante, Patrick (2021). Guide des bonnes pratiques pour la conception de jeux sérieux et thérapeutiques destinés aux aînés (in collaboration with Marineau, Sophie; Angulo Mendoza, Gustavo Adolfo; Savard, Isabelle; Pinard, Anne Marie; Lemire, Daniel; Paré, Jean-François, & Pouliot, Sylvie) (Rapport de recherche). Québec, Canada : Observatoire du numérique en éducation.

Awards & Honors

Recognition

Teaching

  • Sherpa Award (2023) for my dedication to students

Industry prizes

  • Google Open Source Peer Bonus Program (2012)

Paper awards

  • Best student paper award (IEEE SCC 2014)
  • Best paper award (CASCON 2002)

Community Service

Public appearances

Program committee (international conferences)

  • ACM Conference on Information and Knowledge Management (ACM CIKM)
  • ACM Conference on Web Search and Data Mining (ACM WSDM)
  • ACM Conference on Information Retrieval (ACM SIGIR)
  • ACM Conference on Recommender Systems (ACM RecSys)
  • ACM/IEEE Joint Conference on Digital Libraries (JCDL)

Funding bodies

  • FRQNT: review committee 03F (theoretical computer science) since 2007.
  • FRQNT: review committee 309 (team projects in computer science) since 2006.
  • NSERC: Research Tools and Instruments Grants Program (2012-2015)
  • NSERC: Computer Science Evaluation Group (EG 1507) for the Discovery Grants Program (2018-2021), committee co-chair in 2019-2020 and 2020-2021

External referee (Ph.D.)

  • Luca Versari of Pisa University, Italy (2021) - supervised by Roberto Grossi.
  • Kareem El Gebaly at Waterloo University (2018) - supervised by Jimmy Lin, Lukasz Golab and Ashraf Aboulnaga.
  • Mohammed Shaaban at Université Pierre et Marie Curie (2017) - supervised by Patrick Garda.
  • Mehdi Boukhechba at UQAC (2016) - supervised by Abdenour Bouzouane and Charles Gouin-Vallerand.
  • Hicham Assoudi at UQAM (2016) - supervised by Hakim Lounis.
  • Khaled Dehdouh at Lyon 2 (2015) - supervised by Omar Boussaid.
  • Martin Leginus at Aalborg University (2015) - supervised by Peter Dolog.
  • Ahmad Taleb at Concordia University (2011) - supervised by Todd Eavis.

External referee (promotion)

  • Sabine Loudcher Rabaseda at Université Lyon2 - habilitation.
  • Jason Sawin at University of St. Thomas.
  • Amer Nizar AbuAli at Philadelphia University.
  • Jinan Fiaidhi at Lakehead University.

Journal

  • Editor, Software: Practice and Experience (2021-...).
  • Distinguished Referee, Software: Practice and Experience, 2018.
  • Associate editor, Heliyon Computer Science (2015-2023).