Solution Overview

NPUsearch is a computational storage appliance that enables fast, fixed-throughput, regular-expression-based search of files. Each appliance is a 2U rack-mount server with 24 Samsung SmartSSD™ drives, for a total of 96 TB of SSD storage. NPUsearch can search every file on an appliance in 25 minutes or less. Appliances can be aggregated to support multiple petabytes of searchable storage while keeping the maximum search time at 25 minutes. NPUsearch functionality is exposed through a Python library: users submit a list of regular expressions to search for and a list of files to search, and NPUsearch returns the list of files that match one or more of the submitted expressions. Many customers use NPUsearch as a pre-filter near the beginning of their data analytics pipeline.
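The input/output contract described above (a list of regular expressions and a list of files in, a list of matching files out) can be illustrated with a short sketch. The NPUsearch library's actual function names and signatures are not given in this overview, so `search_files` below is a hypothetical stand-in built only on Python's standard `re` and `pathlib` modules. It runs on the host CPU and shows the matching semantics only, not how the appliance performs the search.

```python
import re
from pathlib import Path
from typing import Iterable, List


def search_files(patterns: Iterable[str], paths: Iterable[str]) -> List[str]:
    """Hypothetical stand-in for the described contract (not the NPUsearch API).

    Returns the subset of `paths` whose contents match at least one of
    the regular expressions in `patterns`, preserving input order.
    """
    # Compile as bytes patterns so files are searched without decoding.
    compiled = [re.compile(p.encode("utf-8")) for p in patterns]
    matches = []
    for path in paths:
        data = Path(path).read_bytes()
        # A file qualifies as soon as any one expression matches.
        if any(rx.search(data) for rx in compiled):
            matches.append(path)
    return matches
```

Used as a pre-filter, the returned list would be passed to the next pipeline stage, so later stages only open files that matched at least one expression.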

Lab Diagram