Showing posts from February, 2019

Azure Databricks: Read and write data into a SQL database

In this post I would like to explain how to connect to a SQL Server database from Databricks, for both reading and writing. One of my customer projects needed this, as processed data moves from the Azure Data Lake layer to the aggregate layer, which is a SQL Server database. The steps to connect to SQL Server from Databricks are clearly written in the Azure documentation, but I would like to describe my own experience. The code is developed in Spark Scala.

```scala
// Load the Microsoft SQL Server JDBC driver
Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver")

// Read the credentials from a Databricks secret scope
val jdbcUsername = dbutils.secrets.get(scope = "dev-cluster-scope", key = "dev-sql-user")
val jdbcPassword = dbutils.secrets.get(scope = "dev-cluster-scope", key = "dev-sql-pwd")

val jdbcHostname = "<SQL Server name here>"
val jdbcPort = 1433
val jdbcDatabase = "<database name>"

val jdbcUrl = s"jdbc:sqlserver://${jdbcHostname}:${jdbcPort};database=${jdbcDatabase}"

// Create a Properties() object to hold the connection parameters
```
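The excerpt stops at creating the connection properties. A minimal sketch of how the read and write could continue from there; the table names (`dbo.Aggregates`, `dbo.Aggregates_out`) are placeholders of my own, not from the original post:

```scala
import java.util.Properties

// Connection properties built from the secrets and URL defined above
val connectionProperties = new Properties()
connectionProperties.put("user", jdbcUsername)
connectionProperties.put("password", jdbcPassword)
connectionProperties.setProperty("Driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")

// Read a SQL Server table into a Spark DataFrame
val aggDf = spark.read.jdbc(jdbcUrl, "dbo.Aggregates", connectionProperties)

// Write a DataFrame back to the aggregate layer, appending to the target table
aggDf.write
  .mode("append")
  .jdbc(jdbcUrl, "dbo.Aggregates_out", connectionProperties)
```

This runs inside a Databricks notebook, where `spark` and `dbutils` are already in scope.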

Azure Databricks: Mounting ADLS

The Databricks File System (DBFS) allows you to store all processed or unprocessed records in its file system. My customer is not ready to keep any data in DBFS, as they believe it is not as secure as Azure Data Lake Store (ADLS). ADLS is not mounted to Databricks by default, so it is my turn to mount ADLS as the source layer where data is stored for Databricks to process. Before mounting ADLS to Databricks, make sure the steps below have completed successfully.

1. Install Databricks.
2. Install Azure Data Lake Store.
3. Register Databricks with Azure Active Directory, which is required to link Databricks with AD. Once you register the Databricks app, you will get a service principal ID, and this ID should be provided at the time of mounting.

Let's go through the app registration process first. Steps: click on Azure Active Directory and select App registrations from the left side of the window. Now clic
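For reference, the mount call that the steps above lead to can be sketched as follows, using the ADLS Gen1 OAuth mount pattern from the Databricks documentation. The application ID, directory ID, secret key name, store name, and mount point shown here are placeholders of my own:

```scala
// Sketch: mounting ADLS (Gen1) with the service principal obtained from app registration.
// "<application-id>", "<directory-id>", "<service-credential-key>", and
// "<datalakestore-name>" are placeholders, not values from the original post.
val configs = Map(
  "dfs.adls.oauth2.access.token.provider.type" -> "ClientCredential",
  "dfs.adls.oauth2.client.id" -> "<application-id>",
  "dfs.adls.oauth2.credential" ->
    dbutils.secrets.get(scope = "dev-cluster-scope", key = "<service-credential-key>"),
  "dfs.adls.oauth2.refresh.url" ->
    "https://login.microsoftonline.com/<directory-id>/oauth2/token"
)

// Mount the ADLS store so it appears under /mnt/source-layer in DBFS
dbutils.fs.mount(
  source = "adl://<datalakestore-name>.azuredatalakestore.net/",
  mountPoint = "/mnt/source-layer",
  extraConfigs = configs
)
```

Once mounted, notebooks can read from the store with ordinary paths such as `/mnt/source-layer/...` instead of keeping data in DBFS itself.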