Potential Future Options
Feature | Description | Microsoft Azure Databricks | Snowflake | Azure Synapse SQL Pools | Redshift
Billing
Compute | How compute billing works in each platform. | Pay-as-you-go model and a pre-purchased capacity model. Pricing is based on the total amount of computing resources used by the Databricks workspace. | Pay-as-you-go (minimum 60 seconds, then per-second increments) | Hourly for non-dedicated SQL pools. Dedicated SQL pools, which are required to use T-SQL, can also be billed hourly, but the main deployment model is a continuously running one (like IaaS), so it makes more sense to deploy reserved instances (RIs). | Hourly or monthly, depending on whether On-Demand or Reserved instances are used
Auto-suspend / Auto-resume for Compute | Ability to pause and resume billing for compute programmatically through the platform to save cost. | Yes | Yes | No. Needs to be manually controlled from either the GUI or the API. If suspended, it will not be reachable by any downstream systems. | No. Manually controlled, and the platform will not be reachable by downstream systems while suspended.
 
Concurrency isolation | Ability for many business users to run queries concurrently without affecting each other or degrading performance. | Yes – via Workspace Level Permissions, Cluster Access Controls, Job Scheduling and Job Isolation | Yes – via separate Virtual Warehouses for different workloads | Yes – dedicated SQL Pool. Requires ongoing administration to monitor capacity limits. | Yes – via a clustered approach to deployment
Downtime during Scaling | Whether scaling resources up or down interrupts live business users (ideally, no outage). | Yes – recommended during off-peak hours | No. Seamless scaling. | Yes. Stops all connections and rolls back existing transactions. | No. Scaling within defined parameters is automatic but will have a "lag" time.
Administration (Manual effort intensive)  
Partition strategy | Data distribution strategy between compute nodes (maintenance overhead – lower TCO) | Yes | Not required | Yes | Not required
Index Maintenance | Creation and maintenance of indexes for tuning (maintenance overhead – lower TCO) | Yes | Not required | Yes | Not required
Materialized View Maintenance | Periodic refresh of materialized views (maintenance overhead – lower TCO) | Yes | Not required | Yes | Yes
Statistics Collection | Refresh of table statistics (maintenance overhead – lower TCO) | Yes | Not required | Yes | Yes, can be automated
Working with data  
Structured data support | Support for basic data in tables which has rows and columns e.g. traditional database. | Yes | Yes | Yes | Yes
Semi-structured data support | Support for advanced data structures e.g. semi-structured data. | No | Yes | No – Additional services required | Yes
Unstructured data support | Support for complex data structures e.g. data from our buildings which provide information on sustainability and building occupancy stats. | No | Yes | No – Additional services required | No – Additional services required
Enterprise Readiness
Secure Data Sharing | Secure sharing of data with external business partners such as auditors and major lease clients such as EY. | Yes – via Access Controls, Encryption, Data Masking, VNet Peering and Private Endpoints | Yes – Built in (sharing is multi-cloud and multi-region, with zero-copy movement within the same cloud region) | Yes – Azure Data Share | Yes
Data Cloning | The ability to easily replicate production data for the business to perform scenario modelling and analysis. | Yes | Yes – Zero Copy Clone | No | No
Time Travel | The ability to roll data back in seconds to a previous version. | Yes – via Delta Lake Time Travel, Data Versioning, Backup and Restore | Yes – up to 90 days | No | No
Connectors & Drivers | Supported programming languages and clients (consider removal) | Python, SQL, Scala, R | Spark, Python, .NET, Kafka, ODBC, JDBC, PHP, Go and Node.js | Spark, Python, .NET, Spark SQL, ADO.NET, ODBC, PHP, and JDBC | Spark, Python, .NET, Kafka, ODBC, JDBC, PHP, Go and Node.js
Language Support | Supported languages for in-database processing (consider removal) | Python, SQL, Scala, R | SQL, Scala/Java (Dataframes) and Python in the future | SQL | SQL
Geospatial Data Support | Native support for geospatial data such as location-based data from our field devices. | Yes – using GeoPandas and Magellan | Yes | No | Yes
Security 
Azure AD integration | Support for federated login, Active Directory users and groups | Supported | Supported | Supported | Supported
OAuth / MFA | | Yes | Yes | Yes | Yes
Role Based Access Control | Ability to create a role which has access to a number of tables / data. | Yes | Yes | Yes | Yes
Data Encryption | Ability to store data in an encrypted format | Yes – Default | Yes – Default | Yes – Default | Yes – Default
Data Masking | Ability to mask/hide sensitive information such as credit card details or personally identifiable information. | Yes | Yes | Yes | Yes
Data Compression | Ability to store data in a compressed format to reduce costs. | Yes – through Snappy Compression / Gzip Compression | Default | Yes – through compression codecs / file formats | Yes – through compression codecs / file formats
Data Access Audit | Ability to look up who has historically accessed specific data within the platform | Yes | Yes | Yes | Yes
Built in Auto Classification & Anonymisation | Ability to automatically classify and tag sensitive data attributes such as personally identifiable information and subsequently automatically anonymise data. | No – Requires additional tooling | Yes – Currently in private preview | No – Requires additional tooling | No – Requires additional tooling
Object Tagging | Ability to add custom tags to our data, such as cost centre, or to custom-tag sensitive data attributes. | Yes | Yes | No | No
 
Deployment Model | How is this service consumed by us | Software as a Service (SaaS) | Software as a Service (SaaS) | Platform as a Service (PaaS) | Platform as a Service (PaaS)
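
To make the Compute billing row more concrete, here is a rough sketch of how the two billing granularities in the table could play out for a single short run. The 60-second minimum with per-second increments comes from the Snowflake cell. The round-up-to-the-hour rule for hourly billing, the function names and the rate of 4.0 per hour are all assumptions made for illustration, not figures from any vendor's price list.

```python
import math


def per_second_billed_seconds(run_seconds: int, minimum_seconds: int = 60) -> int:
    """Seconds charged under per-second billing with a minimum charge.

    Mirrors the "minimum 60 seconds, then per-second increments" rule
    from the table. A run that never started is not charged.
    """
    if run_seconds <= 0:
        return 0
    return max(run_seconds, minimum_seconds)


def hourly_billed_seconds(run_seconds: int) -> int:
    """Seconds charged under hourly billing.

    ASSUMPTION: any started hour is charged in full. The table only says
    "hourly", so this rounding rule is illustrative.
    """
    if run_seconds <= 0:
        return 0
    return math.ceil(run_seconds / 3600) * 3600


def cost(billed_seconds: int, rate_per_hour: float) -> float:
    """Convert billed seconds to a cost at a hypothetical hourly rate."""
    return billed_seconds * rate_per_hour / 3600


if __name__ == "__main__":
    rate = 4.0  # hypothetical price per compute-hour
    for run in (10, 90, 3700):
        per_second = cost(per_second_billed_seconds(run), rate)
        hourly = cost(hourly_billed_seconds(run), rate)
        print(f"{run:>5}s run: per-second {per_second:.4f}, hourly {hourly:.4f}")
```

Under these assumptions, the gap between the two models is largest for short, bursty workloads, which is also where auto-suspend and auto-resume matter most.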
