Massively Scalable Computational Finance with SciDB

April 5, 2018 | Author: Anonymous | Category: Software


Description

1. Massively Scalable Computational Finance with SciDB
Bryan Lewis, Chief Data Scientist
Frank Smietana, Solutions Architect

2. ©Paradigm4 GoToWebinar
• Ask questions using the Q&A window
• This webinar is being recorded
• Replays will be available from paradigm4.com

3. Common issues
• Expensive data ETL
• Lack of horizontal scalability
• Hard to program
• Hard to extend
• Difficulty with data JOINs

4. What is SciDB?
Massively scalable distributed array database

5. What is SciDB?
Open source

6. What is SciDB?
Mike Stonebraker, CTO

7. Big Science and SciDB
• Lawrence Berkeley
• NASA Goddard
• Projects using satellite image data
• Institute for Geoinformatics
• Global land change analysis on remote sensing data (LANDSAT, MODIS, SENTINEL)

8. Commercial applications
• Pharma, Biotech, Healthcare
• Quantitative Finance
• Image & Sensor Analytics
• E-commerce

9. Arrays for finance
Axes: Symbol, Time

10. Fast multidimensional SELECTs

11. Table model

i  j  data
1  1   0.5
1  2   0.3
1  3   0.1
1  4  -0.5
2  1   0.9
2  2   0.0
2  3  -0.8
2  4  -0.8
3  1   1.1
3  2   1.0
3  3   1.2
3  4   1.5
4  1   0.9
4  2   1.0
4  3   1.2
4  4   1.5

12. Array model (rows indexed by i, columns by j, origin at (1,1))

 0.5   0.3   0.1  -0.5
 0.9   0.0  -0.8  -0.8
 1.1   1.0   1.2   1.5
 0.9   1.0   1.2   1.5

13. Our approach
• Less data movement
• Spatial data clustering
• Leverage popular languages
• Extensibility

14. Use Popular Languages
• C++, Julia, Java/JVM, Javascript, Array SQL
• JDBC, Protocol buffers, C/C++ API, HTTP

15. Shared-nothing architecture
SciDB 0, SciDB 1, SciDB 2, SciDB …

16. Common issues
• Expensive data ETL
• Lack of horizontal scalability
• Hard to program
• Hard to extend
• Difficulty with data JOINs

17. SciDB
• Minimize ETL
• Massively scalable
• Program from many languages
• Open-source extensibility
• Fast parallel JOIN

18. Poll

19. Examples
• Order books
• Network analysis

20. Order book challenges
• Lots of exchanges
• Regulatory compliance
• Margins are shrinking
• Want more alpha

21. Create order book
• Load raw data into array
• Dimension along symbol and time coordinate axes
• Create order book entries with custom aggregation function ORDERBOOK
https://github.com/Paradigm4/orderbook-example

22. Consolidate order books
• Load as arrays
• Merge into single array
• Impute missing values (inexact temporal join)
• Aggregate by time and symbol

23. Example Order Books

24. Merge and impute

25. Consolidated Order Book

26. Benchmark Results
• 9 exchanges; 358,000,000 events; 8,000 symbols
• Order book depth: 10

27. Financial network analysis

28. A graph

29. Sparse matrix representation

30. Bitcoin transactions
• A directed graph
• Represented as a nonsymmetric sparse matrix
• From address, To address
• Date, Amount, Transaction ID

31. Bitcoin network schema (using the Reid/Harrigan user ID method)

32. Identify important nodes
• Kleinberg HITS method
• Subgraph centrality
• Fiedler clustering
• Other methods...

33. Bitcoin subgraph centrality
• Identify top 5 most central hub and authority nodes
• 16.3M nodes
• 6.3M x 6.3M sparse matrix
• 8-instance SciDB cluster on a single workstation (8 cores)
• 20 seconds

34. Correlation network
1 Compute bar data closing prices from TAQ trades
2 na.locf imputation
3 Correlation matrix across all instruments
4 Regularize
5 Precision matrix
6 Threshold
7 Plot clusters
All inside SciDB up to plot

35. Take away
• Bringing the analysis to the data
• In-database complex math
• Parallel time series analysis
• Programmable from C++, R, Python ...
• MPP on commodity clusters, clouds
• Extensible, open-source
www.paradigm4.com

36. Questions?
Tell us about your application
• [email protected]
Try our Quick Start
• scidb.org/forum
• Download a VM or EC2 AMI
www.paradigm4.com
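
Illustration (not part of the original slides): the "Consolidate order books" steps (merge, impute with an inexact temporal join, aggregate by time and symbol) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the SciDB implementation and not the ORDERBOOK aggregate from the linked repository. It assumes each event is a top-of-book quote `(exchange, symbol, time, bid, ask)`, imputes by carrying the last observation forward, and takes the highest bid and lowest ask across exchanges. The function name `consolidate` and the sample data are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

def consolidate(events):
    """Merge per-exchange quotes into one best bid/ask per (symbol, time).

    events: iterable of (exchange, symbol, time, bid, ask) tuples.
    Assumption: top-of-book only; a real order book tracks more depth.
    """
    # Merge: one time-sorted quote series per (symbol, exchange).
    series = defaultdict(list)
    for exch, sym, t, bid, ask in events:
        series[(sym, exch)].append((t, bid, ask))
    for quotes in series.values():
        quotes.sort()

    # Time grid per symbol: every time stamp seen on any exchange.
    grid = defaultdict(set)
    for (sym, _), quotes in series.items():
        grid[sym].update(t for t, _, _ in quotes)

    book = {}
    inf = float("inf")
    for sym, times in grid.items():
        for t in sorted(times):
            bids, asks = [], []
            for (s, _), quotes in series.items():
                if s != sym:
                    continue
                # Inexact temporal join: latest quote at or before t
                # (last observation carried forward).
                k = bisect_right(quotes, (t, inf, inf))
                if k:
                    _, bid, ask = quotes[k - 1]
                    bids.append(bid)
                    asks.append(ask)
            # Aggregate across exchanges: best (highest) bid, best (lowest) ask.
            book[(sym, t)] = (max(bids), min(asks))
    return book

# Hypothetical sample data: two exchanges quoting one symbol.
events = [
    ("X", "ABC", 1, 10.00, 10.20),
    ("X", "ABC", 4, 10.10, 10.30),
    ("Y", "ABC", 2, 10.05, 10.15),
    ("Y", "ABC", 3, 9.90, 10.25),
]
book = consolidate(events)
for key in sorted(book):
    print(key, book[key])
```

At time 2 exchange X has no new quote, so its time-1 quote is carried forward and compared against exchange Y's fresh quote. A database doing this at the scale quoted on the benchmark slide would run the join and aggregation in parallel across instances; the loop here only shows the logic.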

