Discover BIA: a blog on Business Intelligence, Data Warehouse tools and databases, by Athul Jayachandran.


Power BI deployment premium backend services error
(20 March 2024)

Issue: The deployment pipeline fails when promoting a semantic model from UAT to Production, with the following error:

    An error occurred in Power BI Premium backend services. Other As Persistence error.
    Deployment was stopped.
    The artifact couldn't be deployed to the target workspace. Try deploying the content again.

[Screenshot: deployment pipeline error message]

Cause: Power BI forums and other resources do not seem to have an explanation for this, as the backend services are handled by Microsoft. I read through multiple forum threads to see what is happening here, with little luck. The only thing I could infer is that it seems to happen for data models with two or more sources, which is the case for us.

My guess is that Microsoft is already aware of it, as community posts suggest this has been happening since 2023.
Resolution: This is more of a workaround than a solution, suggested by a user on this community forum page (https://community.fabric.microsoft.com/t5/Service/An-error-occurred-in-Power-BI-Premium-backend-services/m-p/3074121), and it worked for us too:

- Deploy the Power BI dataset to the workspace and let it fail with the error above.
- After the deployment pipeline has been triggered, take the original Power BI semantic model/dataset PBIX file and publish it directly to another workspace; in my case I published it to "My workspace".
- After the manual deployment of the PBIX to a separate workspace, go back to the original deployment pipeline and repeat the deployment from the lower environment to the environment where it failed last time. This should now succeed; it worked for us.

Weird as it may sound, this works for us and for a few other people.

References: https://community.fabric.microsoft.com/t5/Service/An-error-occurred-in-Power-BI-Premium-backend-services/m-p/3074121


Synapse DevOps deployment error - service principal expiration
(13 February 2024)

Issue: Deploying code from the Synapse development environment to higher environments using Azure DevOps failed with the following error:

    Encountered with exception: Error: Get workspace location error: Could not fetch access token for Azure. Verify if the Service Principal used is valid and not expired. For more information refer https://aka.ms/azureappservicedeploytsg
[Screenshot: Azure DevOps pipeline failure caused by the expired service principal]

Cause: Azure DevOps needs access to every environment you are moving code into, and it uses a service principal to get access to those resources. When your IT admin creates the service principal used by Azure DevOps, they normally set a period after which its secret expires. We went past that threshold, hence the error.

Resolution: Ask your Windows AD/Entra admin to create a new secret value, then register it in Azure DevOps with the following steps:

1. Go to Azure DevOps.
2. Navigate to Project Settings > Pipelines > Service Connections, then edit the connection. This opens a prompt as shown below; paste the new service principal key you received from your Windows AD/Entra admin, verify it and save.

[Screenshot: editing the service connection and saving the new secret key]

Once the new service principal key is in place, you should be ready to start deploying again.
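Before saving the new secret in the service connection, it can be worth confirming that it actually authenticates. A minimal sketch using the azure-identity Python package; the tenant, app ID and secret values below are assumed placeholders to fill in:

```python
from azure.identity import ClientSecretCredential

# The three values below are assumed placeholders - use your own.
credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<service-principal-app-id>",
    client_secret="<new-secret-value>",
)

# Requesting an ARM token fails fast if the secret is wrong or expired.
token = credential.get_token("https://management.azure.com/.default")
print("Token acquired, expires at (epoch seconds):", token.expires_on)
```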
Working with the com.crealytics.spark.excel package for Excel files in Azure Synapse
(7 February 2024)

This post should help at least some of you who are trying to get the com.crealytics.spark.excel package up and running in your Synapse workspace and on your Spark pool. I will explain it in the simplest possible steps.

Step 1 - Go to the MVN repository and download the latest JAR file for the crealytics spark-excel package.

[Screenshot: downloading the crealytics spark-excel JAR from the MVN repository]

Step 2 - Once the file is downloaded, go to your Synapse workspace, then to the Manage hub and the Workspace packages tab.

[Screenshot: Workspace packages tab in Synapse]

Step 3 - Upload the JAR file to the workspace packages; it should show up in the list with provisioning status "Succeeded", see below.

[Screenshot: uploaded JAR listed with provisioning status Succeeded]

Step 4 - Once the package is uploaded, go to Manage > Apache Spark pools > Packages and select spark-excel_2.12-3.5.0_0.20.3.jar from the list.
It is important that session-level packages are allowed and that the Spark pool is restarted after this step. See the screenshot below.

[Screenshot: adding the workspace package JAR to the Spark pool]

That's it on the configuration side. Note: if you have higher environments, make sure you repeat these steps there.

In your notebook, you can now read from an Excel file with a code snippet like this:

```python
df = spark.read.format("com.crealytics.spark.excel") \
    .option("header", "true") \
    .option("inferSchema", "true") \
    .load(ReadPath)
```

where ReadPath contains the path to the Excel file in your data lake. You can play around with more options in this piece of code; one example follows below. Hope this helps, please let us know in the comments.
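As one example of those extra options, the package exposes a dataAddress option for reading a specific sheet or cell range; the sheet name here is an assumption, so replace it with your own:

```python
# Read from a named sheet starting at cell A1 instead of the default
# first sheet. "Sheet1" is an assumed name.
df = (spark.read.format("com.crealytics.spark.excel")
      .option("header", "true")
      .option("inferSchema", "true")
      .option("dataAddress", "'Sheet1'!A1")
      .load(ReadPath))
```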
Copy files from Azure Data Lake to a SharePoint site
(26 January 2024)

Use case: We had output files from a use case, processed with PySpark in Synapse and written as CSV and Excel files to a path in Azure Data Lake, but we wanted to put these files on a SharePoint site, which was easier for users in the organization to access.

As soon as we started researching, we hit a limitation (as of 26-01-2024, see https://learn.microsoft.com/en-us/answers/questions/558650/push-file-to-sharepoint-from-adls-using-copy-activ): a SharePoint site is not yet supported as a sink in ADF/Synapse.

The only alternative is to do this with a Logic App, and the steps are fairly simple. At a high level:

1. List Blobs from Azure Data Lake - here you define the storage account and the connection method (we used an access key). A scripted equivalent of this step is sketched at the end of this post.

[Screenshot: List Blobs action configuration]

2. A For Each control loop iterates over the output of the List Blobs step, gets the blob content (pass in the path to the file), and passes that blob content on to a SharePoint "Create file" step.

It is important that the path is set up correctly and that the security credentials used have adequate access to the SharePoint site.

[Screenshot: For Each loop with Get Blob Content and SharePoint Create File steps]

Thing to note: The "Create file" action comes with its own set of limitations, as all things Microsoft do: it cannot move a file larger than 1 GB. This is something we will have to live with for now (update as of 26th Jan).
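For reference, the List Blobs step of the Logic App corresponds roughly to this azure-storage-blob sketch; the connection string, container and folder names are assumed placeholders:

```python
from azure.storage.blob import BlobServiceClient

# Assumed placeholders - use your own storage account, container and path.
service = BlobServiceClient.from_connection_string("<storage-connection-string>")
container = service.get_container_client("data")

# Equivalent of the Logic App "List blobs" action for one output folder.
for blob in container.list_blobs(name_starts_with="Gold/Output/"):
    print(blob.name, blob.size)
```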
Find the Log Analytics key of an Azure Log Analytics instance
(24 November 2023)

Issue: Find the keys needed to connect to a Log Analytics instance in Azure.

Solution: You can find them under Log Analytics Workspace > Agents > Log Analytics Agent Instructions.

[Screenshot: workspace ID and keys under Agents > Log Analytics Agent Instructions]
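As an illustration of where this key is used, the legacy HTTP Data Collector API authenticates with exactly this workspace ID and shared key. A minimal sketch, assuming a custom log type name of your choosing:

```python
import base64
import datetime
import hashlib
import hmac

import requests

WORKSPACE_ID = "<workspace-id>"  # assumed placeholder
SHARED_KEY = "<primary-key>"     # the key found in the portal above
LOG_TYPE = "MyCustomLog"         # assumed custom log type name


def build_signature(date: str, content_length: int) -> str:
    # The API signs the canonical request with HMAC-SHA256, keyed with
    # the base64-decoded shared key.
    string_to_hash = (
        f"POST\n{content_length}\napplication/json\n"
        f"x-ms-date:{date}\n/api/logs"
    )
    digest = hmac.new(
        base64.b64decode(SHARED_KEY),
        string_to_hash.encode("utf-8"),
        digestmod=hashlib.sha256,
    ).digest()
    return f"SharedKey {WORKSPACE_ID}:{base64.b64encode(digest).decode()}"


body = '[{"Computer": "vm-01", "Status": "healthy"}]'
rfc1123_date = datetime.datetime.utcnow().strftime("%a, %d %b %Y %H:%M:%S GMT")
response = requests.post(
    f"https://{WORKSPACE_ID}.ods.opinsights.azure.com/api/logs"
    "?api-version=2016-04-01",
    data=body,
    headers={
        "Content-Type": "application/json",
        "Authorization": build_signature(rfc1123_date, len(body)),
        "Log-Type": LOG_TYPE,
        "x-ms-date": rfc1123_date,
    },
)
print(response.status_code)  # 200 means the record was accepted
```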
Synapse Analytics ADF error - Failed to encrypt sub-resource payload or linked service credentials [solved]
(23 November 2023)

Issue: While trying to publish an HTTP or REST linked service connection from master to live, we get an error if the connection goes through a self-hosted integration runtime into an on-premises network:

    Publishing error
    Failed to encrypt sub-resource payload

    and error is: Failed to encrypted linked service credentials on self-hosted IR 'selfHostedIR', reason is: NotFound, error message is: No online instance..

[Screenshot: publish error for the HTTP linked service on the self-hosted IR]

Cause: When an HTTP call happens inside an on-premises network and you use a self-hosted integration runtime to connect to it, the first time the linked service is published it has to validate the connection, which requires the self-hosted IR to be up and running.

Solution: Have the self-hosted IR (an Azure VM in our case) up and running while publishing from master to live. This enables the validation of the linked service connection; it fixed it for us.

Please do let us know if this fix worked for you.


Storage event trigger error - Register Azure Event Grid resource provider
(1 September 2023)

Issue: Creating a storage event trigger in Synapse/Azure Data Factory fails with the error:

    Register Azure Event Grid resource provider to your subscription before creating an event trigger.

Cause: The Azure Event Grid resource provider is not registered under your Azure subscription. If you are using Azure Synapse Analytics, you also need the Microsoft.DataFactory resource provider registered.

Resolution: Follow these steps to register the Event Grid and Data Factory resource providers on your subscription (a scripted alternative is sketched below the screenshots).

Step 1: Go to portal.azure.com and select the subscription where you want to create the storage event trigger.

Step 2: Under the subscription, go to Settings > Resource providers.

Step 3: Search for and register the two resource providers.
[Screenshot: registering the Microsoft.DataFactory resource provider]

[Screenshot: registering the Microsoft.EventGrid resource provider]
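If you prefer scripting this over clicking through the portal, here is a minimal sketch with the azure-mgmt-resource SDK; the subscription ID is an assumed placeholder:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

# "<subscription-id>" is an assumed placeholder.
client = ResourceManagementClient(DefaultAzureCredential(), "<subscription-id>")

for namespace in ("Microsoft.EventGrid", "Microsoft.DataFactory"):
    client.providers.register(namespace)  # idempotent, safe to re-run
    state = client.providers.get(namespace).registration_state
    print(namespace, state)  # may show "Registering" until it completes
```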
Try creating the trigger again and it should now work. Let us know if this worked for you.


ADF / Synapse - Error converting Excel to Parquet
(31 August 2023)

[Screenshot: copy activity error in Synapse / Azure Data Factory]

Aim: Build a pipeline that takes an Excel file as input from a file container and writes it into a Parquet file.

Issue: When processing the file, the copy activity runs into an error saying one of the fields cannot have more than one value:

    "errorCode": "2200",
    "message": "ErrorCode=ParquetJavaInvocationException,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=An error occurred when invoking java, message: java.lang.IllegalStateException:field 42 (COUNTRY_NAME) can not have more than one value: [Ägypten]\ntotal entry:6\r\norg.apache.parquet.example.data.simple.SimpleGroup.add(SimpleGroup.java:106)\r\norg.apache.parquet.example.data.simple.SimpleGroup.add(SimpleGroup.java:196)\r\norg.apache.parquet.example.data.Group.add(Group.java:59)\r\norg.apache.parquet.example.data.Group.append(Group.java:137)\r\ncom.microsoft.datatransfer.bridge.parquet.ParquetBatchWriter.readRow(ParquetBatchWriter.java:112)\r\ncom.microsoft.datatransfer.bridge.parquet.ParquetBatchWriter.addRows(ParquetBatchWriter.java:60)\r\n.,Source=Microsoft.DataTransfer.Richfile.ParquetTransferPlugin,''Type=Microsoft.DataTransfer.Richfile.JniExt.JavaBridgeException,Message=,Source=Microsoft.DataTransfer.Richfile.HiveOrcBridge,'",
    "failureType": "UserError",
    "target": "Copy Bronze - Excel to Parquet",

Cause: This has to do with the Excel file having two or more columns with the same name. We had the field "COUNTRY_NAME" twice in the Excel, and this was causing the issue.

Fix: Remove the duplicated field and re-process the file.
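One way to catch this before the pipeline runs is to scan the Excel header row for duplicate names. A small sketch using openpyxl; the file name is an assumption:

```python
from collections import Counter

from openpyxl import load_workbook

# "input.xlsx" is an assumed file name - point this at your source file.
workbook = load_workbook("input.xlsx", read_only=True)
header = [cell.value for cell in next(workbook.active.iter_rows(max_row=1))]

duplicates = [name for name, count in Counter(header).items() if count > 1]
if duplicates:
    print("Duplicate column names:", duplicates)  # e.g. ['COUNTRY_NAME']
```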
Self-hosted integration runtime sharing between Synapse workspaces
(30 March 2023)

Scenario: We have to connect to an on-premises SQL Server/Oracle database and have three Synapse workspaces in Azure (DEV, UAT and PROD) in three different subscriptions.

[Diagram: current architecture, one VNet and self-hosted IR per workspace]

To connect to these sources on the on-premises networks, we have virtual networks set up in Azure, gateways and site-to-site connections to the local network.

The downside of this is having a lot more to maintain from an administrator's perspective:

1. Three virtual networks
2. Peering between the three networks
3. Three separate virtual machines to host the respective self-hosted IRs

Instead of this, we thought of simplifying it to one virtual network and one VM hosting a single self-hosted integration runtime, shared by all three Synapse workspaces; see the architecture below.

[Diagram: proposed architecture with a shared self-hosted IR]
href="https://learn.microsoft.com/en-us/azure/synapse-analytics/data-integration/concepts-data-factory-differences#available-features-in-adf--azure-synapse-analytics">https://learn.microsoft.com/en-us/azure/synapse-analytics/data-integration/concepts-data-factory-differences#available-features-in-adf--azure-synapse-analytics</a> </span><span style="font-family: arial;"> )</span></div><div><span style="font-family: arial;"><br /></span></div><div><div class="separator" style="clear: both; font-family: arial; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEirx2uUdEvdwgY0CA84Beb5wXWmFoPsupynWzdz-ctQq2GwfMbHswoDtcsaFAWvDnzzb8i7EIcoEzzqExqoHaD_OgZtcSe8Quv4p6i3fQ1WxjovFK9wkhb0u4SLzr1EI0ZVCEcFMwc9pOzcygCej9VxVFrS1krwgsuKDp06BJ8ZjTiT_hq_af1Kjr0p" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="423" data-original-width="885" height="191" src="https://blogger.googleusercontent.com/img/a/AVvXsEirx2uUdEvdwgY0CA84Beb5wXWmFoPsupynWzdz-ctQq2GwfMbHswoDtcsaFAWvDnzzb8i7EIcoEzzqExqoHaD_OgZtcSe8Quv4p6i3fQ1WxjovFK9wkhb0u4SLzr1EI0ZVCEcFMwc9pOzcygCej9VxVFrS1krwgsuKDp06BJ8ZjTiT_hq_af1Kjr0p=w400-h191" width="400" /></a></div><br /><span style="font-family: arial;"> </span></div><div><span style="font-family: arial;">This seems to be something in the backlog for the Synapse Analytics Product team, I will be updating here if this comes out in any new releases. This will be really cool to have. </span></div><div><br /></div><p></p>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-38859006350886391492023-01-23T16:13:00.007+05:302023-01-23T22:49:52.575+05:30Azure Synapse Analytics workspace outage in West Europe - 23rd January 2023<p>There seems to be a region wide outage for Azure Synapse Analytics workspaces in the West Europe Region. 
Azure Synapse Analytics workspace outage in West Europe - 23 January 2023
(23 January 2023)

There seems to be a region-wide outage for Azure Synapse Analytics workspaces in the West Europe region. We have not been able to open our Synapse workspaces hosted in West Europe since the morning.

Twitter users (https://twitter.com/ErwindeKreuk/status/1617435086483296257) were the first to report this to Azure support early this morning, CET.

[Screenshot: tweet reporting the outage to Azure support]

It is really odd, as the Microsoft service health page doesn't show any service as down in the West Europe region.

[Screenshot: Azure status page showing all West Europe services as healthy]

A couple of hours later they put up a notification flagging it as an emerging issue:

    Impact Statement: Starting at 05:30 UTC on 23 Jan 2022, a number of Azure Synapse Analytics customers with resources in West Europe may be experiencing connectivity issues when trying to access their workspaces. Current Status: We are aware of this issue and are currently investigating to understand the underlying cause and work on a mitigation strategy. Further updates will be provided in 60 minutes, or as events warrant.

Microsoft is yet to provide an ETA. There have also been multiple community posts raised asking for help on the same issue.
[Screenshot: users calling out Azure for not reporting the outage on the health portal]

We will keep following up and sharing updates here. You can also follow this community portal question for updates: https://learn.microsoft.com/en-us/answers/questions/1163259/synapse-workspace-in-west-europe-not-opening-up

Update 11:05 UTC:

[Screenshot: Azure status update]

Update 12:50 UTC:

[Screenshot: Azure status update]

Update 18:17 UTC (hotfix applied and rolled out): Our instances are up at the moment.

[Screenshot: Azure confirmation that a hotfix was rolled out]
data-original-height="476" data-original-width="1215" height="250" src="https://blogger.googleusercontent.com/img/a/AVvXsEj4Z7AD-MgDHcXqGQ34_9hYHVSQOf4YOtq0WPsJvqMuu5T7RvD2euTEVxETt11MBrd9pznzv68RgMkTpyFsAez5HqQTyJk4qGWZeOKoBrHPt404PzxN6EIqR4eBaIXrpzyvJ_iZkJS99uvcAGwwk05MUgd78N1-Wfocw8H2jkcghiDriYBMuvqOPjjR=w640-h250" width="640" /></a></div><br /><br /></div>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-56327093235964663312023-01-18T22:46:00.003+05:302023-01-26T14:41:25.713+05:30Create Power BI user for Synapse Serverless SQL Pool to access External Delta Lake Tables<b>Problem Statement:</b> You are trying to create a seperate user for reading datasets in Power BI. <div><br /></div><div><b>Issue: </b>After creating the user login, you are facing this error while trying to read your external tables or view</div><div><br /></div><div><div><i>Msg 13807, Level 16, State 1, Procedure Dim_ABC, Line 2 [Batch Start Line 0]</i></div><div><i>Content of directory on path 'https://data.dfs.core.windows.net/data/Gold/Dim_ABC/_delta_log/*.*' cannot be listed.</i></div><div><i>Msg 4413, Level 16, State 1, Line 1</i></div><div><i>Could not use view or function 'dbo.Dim_ABC' because of binding errors.</i></div></div><div><br /></div><div><br /></div><div>Solution: </div><div><br /></div><div>Step 1: Create a Master Key</div><div><br /></div><div><span face="Calibri, sans-serif" style="font-size: 11pt;">CREATE
MASTER KEY ENCRYPTION BY PASSWORD = '*******'</span></div><div><br /></div><div><b><u>Step 2: Create a Database Scoped Credential with Managed Identity</u></b></div><div><br /></div><p class="MsoNormal">CREATE DATABASE SCOPED CREDENTIAL [PowerBICredential]<o:p></o:p></p>
<p class="MsoNormal"><o:p></o:p></p><div>WITH IDENTITY = 'Managed Identity' </div><div><br /></div><div><b><u>Step 3: Create a Power BI login</u></b></div><div><br /></div><div><p class="MsoNormal">CREATE LOGIN PowerBI_RO WITH PASSWORD = '*******';<o:p></o:p></p>
<p class="MsoNormal">GO<o:p></o:p></p>
<p class="MsoNormal">CREATE USER PowerBI_RO FROM LOGIN <o:p></o:p>PowerBI_RO</p>
<p class="MsoNormal">GO<o:p></o:p></p>
<p class="MsoNormal">ALTER ROLE db_datareader ADD MEMBER PowerBI_RO;<o:p></o:p></p><p class="MsoNormal"><br /></p><p class="MsoNormal"><b><u>Step 4 -Grant Access to Power BI Scoped Credential to
Power BI login</u></b></p><p class="MsoNormal">GRANT CONTROL TO <o:p></o:p>PowerBI_RO</p><p class="MsoNormal">
</p><p class="MsoNormal">GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[PowerBICredential]
TO PowerBI_RO<o:p></o:p></p><p class="MsoNormal"><br /></p><p class="MsoNormal"><b><u>Step 5 - Create an External Data Source associated to this credential</u></b></p><p class="MsoNormal">CREATE EXTERNAL DATA SOURCE [DeltaLakeStorage] <o:p></o:p></p><p class="MsoNormal"> WITH (<o:p></o:p></p><p class="MsoNormal"> LOCATION =
'abfss://containername@datalakename.dfs.core.windows.net/' <o:p></o:p></p><p class="MsoNormal"> ,
CREDENTIAL= [PowerBICredential] <o:p></o:p></p><p class="MsoNormal">
</p><p class="MsoNormal"> )<o:p></o:p></p><p class="MsoNormal"><br /></p><p class="MsoNormal"><b>Step 6: Login and run a select statement and you should be able to use this to view delta lake tables.</b></p></div><div><br /></div><div>Update(26th January): Microsoft Synapse Analytics youtube page has recently published a video tutorial on how to do this. - https://www.youtube.com/watch?v=_OZ7XUapJZs </div>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-69448917098447325792022-12-12T21:12:00.001+05:302022-12-12T21:12:10.815+05:30Pyspark - Hash() function creates duplicates - solved<p> Aim - To create a unique integer identifier from a column in your pyspark dataframe. </p><p>Issue:- When using the hash() function in pyspark it is very prone to hash collisions as you might see the same hash value for two different source column values, see example below,</p><p><br /></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEjAWLpR1k-AaBjLR77E7Biv1wFo-GIYlCbb1qdFba32254gb06f-L0sdfexEl6TNr_LHdMrwO2s1goqcjU36vRvRSEdGMUex-rNdtx_JNaGmBdVUwusT_FhN6HND9IRrsp205w9vn3ASmJM9Q-pQt3yqli6oi_mXrXVtDoVMcMB9EStX-rJXKuVJIlQ" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="64" data-original-width="442" src="https://blogger.googleusercontent.com/img/a/AVvXsEjAWLpR1k-AaBjLR77E7Biv1wFo-GIYlCbb1qdFba32254gb06f-L0sdfexEl6TNr_LHdMrwO2s1goqcjU36vRvRSEdGMUex-rNdtx_JNaGmBdVUwusT_FhN6HND9IRrsp205w9vn3ASmJM9Q-pQt3yqli6oi_mXrXVtDoVMcMB9EStX-rJXKuVJIlQ=s16000" /></a></div><br /><br /><p></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEj7Y8vZdSMihHpyyLowS5FvzrgJb-FxtcIaHk68iYSmFUG-4WZBCQcXw6MipqAz-qXfRn7fBPlBiSTO-Gmj6MqZvJrgk-_S9u9aCGJlKBqmoQ4SNdplCC_VvHl3ShVInPTTKDjh7NphfIRBmuKUX6uv_9uay4vIpgbgP9bSrFMR0g0TKR-L2PltnejC" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="145" data-original-width="217" src="https://blogger.googleusercontent.com/img/a/AVvXsEj7Y8vZdSMihHpyyLowS5FvzrgJb-FxtcIaHk68iYSmFUG-4WZBCQcXw6MipqAz-qXfRn7fBPlBiSTO-Gmj6MqZvJrgk-_S9u9aCGJlKBqmoQ4SNdplCC_VvHl3ShVInPTTKDjh7NphfIRBmuKUX6uv_9uay4vIpgbgP9bSrFMR0g0TKR-L2PltnejC=s16000" /></a></div><br /><b>Solution</b> - Use the <span style="background-color: #fffffe; font-family: Consolas, "Courier New", monospace; font-size: 13px; white-space: pre;">xxhash64()</span> pyspark function to reduce such hash collisions. 
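To sanity-check Step 6 from outside Power BI, you can connect with the new login over ODBC. A small sketch with pyodbc; the server, database and password values are assumed placeholders:

```python
import pyodbc

# Placeholders below are assumptions - fill in your own serverless endpoint.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>-ondemand.sql.azuresynapse.net;"
    "DATABASE=<database>;"
    "UID=PowerBI_RO;PWD=<password>"
)

# If the credential grants above are in place, this should return rows
# instead of the 'cannot be listed' binding error.
for row in conn.execute("SELECT TOP 5 * FROM dbo.Dim_ABC;"):
    print(row)
```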
PySpark - hash() function creates duplicates - solved
(12 December 2022)

Aim: Create a unique integer identifier from a column in your PySpark dataframe.

Issue: The hash() function in PySpark is quite prone to hash collisions: you may see the same hash value for two different source column values, as in the example below.

[Screenshot: two different source values producing the same hash() value]

Solution: Use the xxhash64() PySpark function to reduce such hash collisions. See the same field below, now created with the xxHash64 algorithm and returning unique values in Product_Id_2.

[Screenshot: xxhash64() returning distinct values in Product_Id_2]

Note: The new column created with xxhash64() is a big integer (64-bit), not an integer.
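For reference, a minimal sketch of the two functions side by side; the dataframe df and the Product_Name column are assumptions based on the screenshots:

```python
from pyspark.sql import functions as F

# df and the Product_Name column are assumed from the screenshots above.
df2 = (df
       .withColumn("Product_Id",   F.hash(F.col("Product_Name")))       # 32-bit, collision-prone
       .withColumn("Product_Id_2", F.xxhash64(F.col("Product_Name"))))  # 64-bit, far fewer collisions
```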
Synapse Analytics - Create Delta Lake database, schema or table fails with java.lang.IllegalArgumentException: null
(23 November 2022)

Issue: Trying to create a Delta Lake database, a schema or a Delta Lake table fails with the error java.lang.IllegalArgumentException: null path.

[Screenshot: error when creating the lake database in Synapse]

For a quick background, I am using a small cluster with 3-8 worker nodes and running the syntax from a Synapse notebook.

Error: The error log goes as below:

    Error: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: null path
    org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:111)
    org.apache.spark.sql.hive.HiveExternalCatalog.createDatabase(HiveExternalCatalog.scala:193)
    org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:153)
    org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:140)
    org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:45)
    org.apache.spark.sql.hive.HiveSessionStateBuilder.$anonfun$catalog$1(HiveSessionStateBuilder.scala:60)
    org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog$lzycompute(SessionCatalog.scala:133)

It hints at a path being null.
sans-serif;">org.apache.spark.sql.catalyst.catalog.SessionCatalog.externalCatalog$lzycompute(SessionCatalog.scala:133)</span></span></blockquote><span style="background-color: white; color: #333333; font-family: "Segoe UI", SegoeUI, "Helvetica Neue", Helvetica, Arial, sans-serif; font-size: 17.5px;"></span></div><div><br /></div><div>It hints at a path being null. </div></div></div><div><br /></div><div><u><b>Resolution:</b></u></div><div><br /></div><div>The very first thing to check is whether the path you are trying to pass on while creating the table is correct or not, for example see the syntax below,</div><div><br /></div><div><span style="font-size: x-small;"><span style="background-color: white; color: #333333; font-family: "Segoe UI", SegoeUI, "Helvetica Neue", Helvetica, Arial, sans-serif;">create TABLE lakedatabasename.tablename</span><br style="background-color: white; box-sizing: inherit; color: #333333; font-family: "Segoe UI", SegoeUI, "Helvetica Neue", Helvetica, Arial, sans-serif; outline-color: inherit;" /><span style="background-color: white; color: #333333; font-family: "Segoe UI", SegoeUI, "Helvetica Neue", Helvetica, Arial, sans-serif;">USING DELTA</span><br style="background-color: white; box-sizing: inherit; color: #333333; font-family: "Segoe UI", SegoeUI, "Helvetica Neue", Helvetica, Arial, sans-serif; outline-color: inherit;" /><span style="background-color: white; color: #333333; font-family: "Segoe UI", SegoeUI, "Helvetica Neue", Helvetica, Arial, sans-serif;">LOCATION 'abfss:</span><a href="mailto://container@datalake.dfs.core.windows.net" style="background-color: white; box-sizing: inherit; cursor: pointer; font-family: "Segoe UI", SegoeUI, "Helvetica Neue", Helvetica, Arial, sans-serif; outline-color: inherit; outline-style: initial; outline-width: 0px; overflow-wrap: break-word; text-decoration-line: none;">//container@datalake.dfs.core.windows.net</a><span style="background-color: white; color: #333333; font-family: "Segoe UI", SegoeUI, "Helvetica Neue", Helvetica, Arial, sans-serif;">/Path'</span></span></div><div><br /></div><div>Check if the path exists by using the below syntax in a spark notebook, </div><div><br /></div><div>mssparkutils.fs.ls('abfss://container@datalake.dfs.core.windows.net/Path')</div><div><br /></div><div>and this should list down all the files you have under the path,</div><div><br /></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwzjvMmLYaznmVT8yTOHdhphOGHS-3G9HI8sSee-0wDf_vcu85RZj45hX3NWcRBs53YyTj3au1zQvzRSyFrC7saTtB2tALokUHb_mctzWvryhzEdb68bDZcaaQeVwapD6016-ykNWzXGJXykuK1GO8yrLvflUuCD42XD8Olwh_D4gWUUmQ52RRysDj/s1364/list_files_in_deltalake_path.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="202" data-original-width="1364" height="59" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwzjvMmLYaznmVT8yTOHdhphOGHS-3G9HI8sSee-0wDf_vcu85RZj45hX3NWcRBs53YyTj3au1zQvzRSyFrC7saTtB2tALokUHb_mctzWvryhzEdb68bDZcaaQeVwapD6016-ykNWzXGJXykuK1GO8yrLvflUuCD42XD8Olwh_D4gWUUmQ52RRysDj/w400-h59/list_files_in_deltalake_path.png" width="400" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div>If files get listed, it's not an issue with your path and you can rule that out.</div><div><br /></div><div>The next thing to check is if the user you are using to create delta lake database/table has the permission 'BlobStorageDataContributor', this would 
<div><br /></div><div>For me, both the above debug steps were okay. Later I read that in Synapse, Delta Lake creates the metadata for a lake database in the workspace's default container, and I had actually given the default container the same name as my storage account while creating the workspace, and had since deleted that default container. </div><div><br /></div><div>This is where the template I extracted during resource creation came in handy. I checked the template and found that I had used the same name for the 'defaultDataLakeStorageFilesystemName' parameter as well, see the screenshot below, </div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRNidXFbhH39sED9tCT2p_2_Q-CCHvPdBo6HYw4egfKktGA4XepgvgK7TqJWx6Mm-7kdEo_mvPQPYgbpkIbKvoAA1IBj5NMul2WxNyCZUh-r99f-IuTKApLlA92e78aArmvAI-ttfMLA0qE_HevJyshqrEkVF1o1MUMpm-TdZhkWjZcC9x1Kxg24AE/s953/Synapse_Workspace_Template.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="356" data-original-width="953" height="239" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRNidXFbhH39sED9tCT2p_2_Q-CCHvPdBo6HYw4egfKktGA4XepgvgK7TqJWx6Mm-7kdEo_mvPQPYgbpkIbKvoAA1IBj5NMul2WxNyCZUh-r99f-IuTKApLlA92e78aArmvAI-ttfMLA0qE_HevJyshqrEkVF1o1MUMpm-TdZhkWjZcC9x1Kxg24AE/w640-h239/Synapse_Workspace_Template.PNG" width="640" /></a></div><br /><div><br /></div><div>The fix was to recreate the default container, so Synapse can create the delta lake database metadata inside it, and then re-run the same commands.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKVrhk9-bzJmMx0ZdZ-hBzcaeCwV-Pf8idWJvAtEftBNcJf7QFKTBsXEKYOfP0lHglFFR1BKgaecVw2psq0m58WpaYs-RcGc7c7BlbPpUQNdA5LWMX8FKhvZRbdD-tpgoWx4CHWSoJM0GmbhK54e9CUKhA9gPwuf9TcylS59EY6VwOF0RiHFq3CKON/s1233/Create_Delta_Lake_Table.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="491" data-original-width="1233" height="159" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKVrhk9-bzJmMx0ZdZ-hBzcaeCwV-Pf8idWJvAtEftBNcJf7QFKTBsXEKYOfP0lHglFFR1BKgaecVw2psq0m58WpaYs-RcGc7c7BlbPpUQNdA5LWMX8FKhvZRbdD-tpgoWx4CHWSoJM0GmbhK54e9CUKhA9gPwuf9TcylS59EY6VwOF0RiHFq3CKON/w400-h159/Create_Delta_Lake_Table.PNG" width="400" /></a></div><br /><div><br /></div><div><br /></div><div>The issue was raised by me and documented in the Microsoft Learn community <a href="https://learn.microsoft.com/en-us/answers/questions/1091727/creating-a-delta-lake-table-from-synapse-anayltics.html" target="_blank">here</a>. </div>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-80495657302048370312022-11-21T21:55:00.002+05:302022-11-21T21:55:43.240+05:30Partitioned column data not displaying in External Table over Delta Lake in Serverless SQL Pool<p><b>Issue:</b> You are trying to create an external table over a delta lake folder path in your Azure Data Lake. The said delta lake table is partitioned over a few columns. For example, see below where the delta lake table is partitioned over the fields 'TheYear' and 'TheMonth'.
</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-FXh7NyWppaQh3ROJW7yeHbThV7_hayeSTAd0xVEj-Dtvg3kREscMbUoqaY8M17Utz5jRgXHZZ_nXW_sxaZ0cRsTT29arY9FYAS2uWyH2GFupN5BXlzNeJaxjTDYL4I1b3jJLUHthxgDPla4Gpqlgt1O7aUmdK7jf5EIX7wrSstrvJk30yQFyiOUW/s853/Paritioned_Delta_Lake.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="Delta Lake" border="0" data-original-height="414" data-original-width="853" height="194" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-FXh7NyWppaQh3ROJW7yeHbThV7_hayeSTAd0xVEj-Dtvg3kREscMbUoqaY8M17Utz5jRgXHZZ_nXW_sxaZ0cRsTT29arY9FYAS2uWyH2GFupN5BXlzNeJaxjTDYL4I1b3jJLUHthxgDPla4Gpqlgt1O7aUmdK7jf5EIX7wrSstrvJk30yQFyiOUW/w400-h194/Paritioned_Delta_Lake.JPG" title="Delta Lake path partitioned over Year and Month" width="400" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><p></p><p>The path was then created as an external table in Azure Synapse, and queried upon, where you can see all columns from the delta lake except for the paritioned columns, please see screenshot below,</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCmwzIfOmLkxVJXDLw-7qJil2dha8MyqHs95yEfBTeqcJjJJavmZLNcPhRGeTFOrSAPMTw0K7di_dz2qxj1mc8wJc8rgQOHFsknEMfKC-UPYb6Plyb6lB18RT7TcTasrXSfM3h0fbwo58MU0bpIbxUIADsKqN3a3bMzGzKwgB-qlrum2yBbDOQ8cGT/s516/External_Table_not_displaying_partitioned_column_data.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="368" data-original-width="516" height="228" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCmwzIfOmLkxVJXDLw-7qJil2dha8MyqHs95yEfBTeqcJjJJavmZLNcPhRGeTFOrSAPMTw0K7di_dz2qxj1mc8wJc8rgQOHFsknEMfKC-UPYb6Plyb6lB18RT7TcTasrXSfM3h0fbwo58MU0bpIbxUIADsKqN3a3bMzGzKwgB-qlrum2yBbDOQ8cGT/w320-h228/External_Table_not_displaying_partitioned_column_data.JPG" title="Partitioned Columns not displaying" width="320" /></a></div><br /><p><br /></p><p><b>Cause</b>: This seems to be because this is not a feature, Serverless SQL pool has started to support(<a href="https://learn.microsoft.com/en-us/answers/questions/481624/creating-external-table-over-partitioned-delta-lak.html?childToView=1098334#answer-1098334" target="_blank">reference</a>).</p><p><b>Solution</b>: However there is an alternate way of achieveing the same using a SQL view with OPENROWSET functionality. 
Then use this view to query the table further.</p><p>Sample syntax as below, </p><div style="background-color: #fffffe; font-family: Consolas, "Courier New", monospace; font-size: 14px; line-height: 19px; white-space: pre;"><div> <span style="color: blue;">CREATE</span> <span style="color: blue;">VIEW</span> vName <span style="color: blue;">AS</span> </div><div> <span style="color: blue;">SELECT</span> <span style="color: blue;">TOP</span> <span style="color: green;">10</span> *</div><div> <span style="color: blue;">FROM</span> <span style="color: #863b00;">OPENROWSET</span>(</div><div> <span style="color: blue;">BULK</span> <span style="color: #a31515;">'Gold/Database/Table1/'</span>,</div><div> <span style="color: blue;">DATA_SOURCE</span> = <span style="color: #a31515;">'DeltaLakeStorage'</span>,</div><div> <span style="color: blue;">FORMAT</span> = <span style="color: #a31515;">'delta'</span></div><div> )</div><div> <span style="color: blue;">WITH</span> ( TheYear <span style="color: blue;">VARCHAR</span>(<span style="color: green;">10</span>),</div><div> TheMonth <span style="color: blue;">VARCHAR</span>(<span style="color: green;">10</span>),</div><div> type <span style="color: blue;">VARCHAR</span>(<span style="color: green;">100</span>),</div><div> count <span style="color: blue;">BIGINT</span></div><div> ) </div></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDteqZSqO2I1MdciVcRITTkEBIiH1jD5hjUYXrEko8os8qm2B9XQzMvbUMVgjYgK7CA5w48GLpfJmCI9ZC0hzZFfLJbFkQ9STq6mUSwKxv5tlQgMAO6BkuslL-GpwMoNP7zpMUY9ebQH-Xny_bAPS2YnOzlOD2CoJ1PdKNSdsENArwC-4rCs1gHLfY/s496/Openrowset_Delta.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="388" data-original-width="496" height="250" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDteqZSqO2I1MdciVcRITTkEBIiH1jD5hjUYXrEko8os8qm2B9XQzMvbUMVgjYgK7CA5w48GLpfJmCI9ZC0hzZFfLJbFkQ9STq6mUSwKxv5tlQgMAO6BkuslL-GpwMoNP7zpMUY9ebQH-Xny_bAPS2YnOzlOD2CoJ1PdKNSdsENArwC-4rCs1gHLfY/s320/Openrowset_Delta.JPG" width="320" /></a></div><br /><div><br /></div>
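<div>Once the view is created, the partition columns can be selected and filtered like any other column. A small usage sketch against the sample view above (the filter values are hypothetical):</div><div><br /></div><div style="font-family: Consolas, monospace; font-size: 14px; white-space: pre;">-- the partition columns TheYear and TheMonth now behave like regular columns
SELECT TheYear, TheMonth, type, SUM([count]) AS total_count
FROM vName
WHERE TheYear = '2022' AND TheMonth = '10'
GROUP BY TheYear, TheMonth, type;</div><div><br /></div>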
Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-57980877844960364072022-10-26T20:58:00.000+05:302022-10-26T20:58:01.196+05:30Error 403 - Failed to load resources in Azure Synapse Studio<p><b>Error:</b> User with Owner access at subscription level gets an error while opening Synapse Studio, and is not able to see any pipelines, linked services, etc.</p><p class="MsoNormal"><i>Failed to load one or more resources due to no access, error code 403</i></p><p class="MsoNormal"></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEgjDdM1d7nflEj0m5zkGHBMPJ_A-n1VYaWp54ceyuhiAAOl49boEQoTmpVW-G0_8idcSeFS7sbZDwMSmNWwmqKQp-CAwESZevbbxRa6umin6kbzW60fpyCX1UU2Wn2TORSszcyOIi4Z8sd0oGH7kX8t9VP3zYTPKSYT1rfI3zfeJqb1ZbDXHl-ojdqJ" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="552" data-original-width="597" height="345" src="https://blogger.googleusercontent.com/img/a/AVvXsEgjDdM1d7nflEj0m5zkGHBMPJ_A-n1VYaWp54ceyuhiAAOl49boEQoTmpVW-G0_8idcSeFS7sbZDwMSmNWwmqKQp-CAwESZevbbxRa6umin6kbzW60fpyCX1UU2Wn2TORSszcyOIi4Z8sd0oGH7kX8t9VP3zYTPKSYT1rfI3zfeJqb1ZbDXHl-ojdqJ=w374-h345" width="374" /></a></div><br /><br /><p></p><p class="MsoNormal"></p><p class="MsoNormal"><b>Root Cause</b>: Even if you have owner-level privileges
at the subscription level, you still have to assign rights inside Synapse
Studio.</p><p></p><p class="MsoNormal"><b>Fix:</b> Go to Synapse Studio > Manage tab</p><p class="MsoNormal"></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEgg2JkVlgL1V69dQAXVIEkOmWoOnizrh8MI-T1UyJUpptzpAt0dUnDkLk_fgDdNKgTRhTyuvqvI0U7QdQ7BktHAQIQKQz8t9iHGHH3EjTNVbi2p8dF0uNCYIjbKci3-r8q-oaShyKrHkhwOE5Bs2wcJJyB0hxmBD_zJhtj4rYLg2wdl6OVt9bZdms6O" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="86" data-original-width="84" height="62" src="https://blogger.googleusercontent.com/img/a/AVvXsEgg2JkVlgL1V69dQAXVIEkOmWoOnizrh8MI-T1UyJUpptzpAt0dUnDkLk_fgDdNKgTRhTyuvqvI0U7QdQ7BktHAQIQKQz8t9iHGHH3EjTNVbi2p8dF0uNCYIjbKci3-r8q-oaShyKrHkhwOE5Bs2wcJJyB0hxmBD_zJhtj4rYLg2wdl6OVt9bZdms6O=w61-h62" width="61" /></a></div><br /><p></p><p class="MsoNormal">Go to Access Control > Click on Add</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEilzOI75X3Fm4aJnAWUvo5CKnzt1EIDJNFBUmZvfjXOUhQHwirzLXwogaYCqawiLZ_xOA0so8J4FvLy7RKwmU2N51VZItqHBl6dIQrOpdXfk71wwzxdM4-RTq7tE8mkiSXg40KbQhNK-dETRziLAx-ZLqYhGX6pox1Ek8fufACX8r2NUjnizIzKq-Fa/s605/Azure_Synapse_Studio_Access_Control.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="228" data-original-width="605" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEilzOI75X3Fm4aJnAWUvo5CKnzt1EIDJNFBUmZvfjXOUhQHwirzLXwogaYCqawiLZ_xOA0so8J4FvLy7RKwmU2N51VZItqHBl6dIQrOpdXfk71wwzxdM4-RTq7tE8mkiSXg40KbQhNK-dETRziLAx-ZLqYhGX6pox1Ek8fufACX8r2NUjnizIzKq-Fa/s16000/Azure_Synapse_Studio_Access_Control.png" /></a></div><br /><p class="MsoNormal">Add the user you want to give Administrator access to</p><p class="MsoNormal"><br /></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7Ntl1qlpwJdiFEBCVpB1THA1YjvhHTeCranJLLtQtB7UsNbSFgJEufZM5cwMStkf_r12U19Sb1BwYHQX_apHbqTHHAp23PPGSWwAl7xvO32fk3OqhVc4SK3jIEB2MYP9kq4kZ3JMoVTin-GkWMi-7o053Joh5nWsNovHPfcFz1f0TFwwN2nm-Zj7O/s624/Synapse_Studio_user_role_privilege.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="398" data-original-width="624" height="352" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7Ntl1qlpwJdiFEBCVpB1THA1YjvhHTeCranJLLtQtB7UsNbSFgJEufZM5cwMStkf_r12U19Sb1BwYHQX_apHbqTHHAp23PPGSWwAl7xvO32fk3OqhVc4SK3jIEB2MYP9kq4kZ3JMoVTin-GkWMi-7o053Joh5nWsNovHPfcFz1f0TFwwN2nm-Zj7O/w552-h352/Synapse_Studio_user_role_privilege.PNG" width="552" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div>Now log out and log back in to the Azure Portal and open Synapse Studio again; you should now be able to see all the resources. 
Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-52764550480457350822022-10-21T13:40:00.002+05:302022-10-26T20:58:27.538+05:30Authorization Failed error while creating Azure Data Factory Instance<p><span style="font-family: georgia;"><b>Issue description</b>: New user not able to create an Azure Data Factory instance despite having Owner or Contributor access at the resource group level.</span></p><p><span style="font-family: georgia;"><b>Error</b> - <i>The client '' with object id '' does not have authorization to perform action '' over scope '/subscriptions/' or the scope is invalid. If access was recently granted, please refresh your credentials. (Code: AuthorizationFailed)</i></span></p><p><span style="font-family: georgia;"><b>Root Cause</b> - The root cause of this issue is that when you set up a subscription, only a default set of Microsoft resource providers is registered by default. </span></p><p><span style="font-family: georgia;"></span></p><div class="separator" style="clear: both; text-align: center;"><span style="font-family: georgia;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEggXIFJnlJohr_xSiRlvsnu35aH3ROgFdcIQHjY1b7z5GGqFch8gRNv67EC4Hk9Bov2aeHHfhS95lmyeEszMrHB9A24U6Hf8ffTfcExMsvVFxx8ui9cFx8eu8pp-dHuiwA3t3kI06c8uI6F1fUSk-002_bf9ROa-mmtmJobSTtjvbwOCYziIeBloyoG" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="273" data-original-width="945" height="186" src="https://blogger.googleusercontent.com/img/a/AVvXsEggXIFJnlJohr_xSiRlvsnu35aH3ROgFdcIQHjY1b7z5GGqFch8gRNv67EC4Hk9Bov2aeHHfhS95lmyeEszMrHB9A24U6Hf8ffTfcExMsvVFxx8ui9cFx8eu8pp-dHuiwA3t3kI06c8uI6F1fUSk-002_bf9ROa-mmtmJobSTtjvbwOCYziIeBloyoG=w640-h186" width="640" /></a></span></div><span style="font-family: georgia;"><br /><br /></span><p></p><p><b style="font-family: georgia;">Fix:</b><span style="font-family: georgia;"> Go to Subscriptions > Resource Providers and add/register the 'Microsoft.DataFactory' provider. 
</span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEgFMNUQs6mCC66M9MTC-fl9OzOiy1TC1w0tULiGZ_BrFvKbGok-WzjRGuqY24ZxNf3Cmnry4E5ecTapttbJJtUx_05M_xYyB_F0d8sJN6Kt3HrgeR1P_MG29t9vlyr42Y_BYFyK1Y7PYFoaj1Y6BrrSAKvLxZ6A4abr1mitw0z60w0rgYF8dk9pqN5t" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="193" data-original-width="884" height="141" src="https://blogger.googleusercontent.com/img/a/AVvXsEgFMNUQs6mCC66M9MTC-fl9OzOiy1TC1w0tULiGZ_BrFvKbGok-WzjRGuqY24ZxNf3Cmnry4E5ecTapttbJJtUx_05M_xYyB_F0d8sJN6Kt3HrgeR1P_MG29t9vlyr42Y_BYFyK1Y7PYFoaj1Y6BrrSAKvLxZ6A4abr1mitw0z60w0rgYF8dk9pqN5t=w640-h141" width="640" /></a></div><br /><br /><p></p><p><br /></p>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-11943782967460693422021-12-14T22:25:00.002+05:302021-12-14T22:25:25.650+05:30PowerBI - Synapse Refresh Issue - 'Key didn't match any rows in table'<p><b>Issue: </b>PowerBI was not able to refresh the underlying dataset from Azure Synapse in a higher environment in Import mode.</p><p>Sample error screenshot is below,</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEgZgKvKYDNJA-uFMZf7ow81ZuiVf0Qc2hPYQ-thRrjGP2pvDKxr4XuqWji81qucoHDb95ScX6utn5c74SQiQbJKm_NWky8E3e4LcS4jPPAq9cjKbkdbx6cVyzF--USbkX9I1fNVUSUmiWBQn-3HHbIa7L1VD83apColIoE0e53uR2UK7GajQSFtrjbk=s1097" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="Error Screenshot PowerBI refresh" border="0" data-original-height="337" data-original-width="1097" height="98" src="https://blogger.googleusercontent.com/img/a/AVvXsEgZgKvKYDNJA-uFMZf7ow81ZuiVf0Qc2hPYQ-thRrjGP2pvDKxr4XuqWji81qucoHDb95ScX6utn5c74SQiQbJKm_NWky8E3e4LcS4jPPAq9cjKbkdbx6cVyzF--USbkX9I1fNVUSUmiWBQn-3HHbIa7L1VD83apColIoE0e53uR2UK7GajQSFtrjbk=w320-h98" width="320" /></a></div><br /><p><b>Cause :</b> The issue was caused because in the higher environment the schema name was created in uppercase letters, whereas in the lower environment it was in lowercase letters. </p><p><b>Resolution:</b> The fix in this case is to move the objects from the existing uppercase schema into a lowercase one. This will incur some effort, since you first have to move the objects to a temporary schema, drop the uppercase schema, create the lowercase schema, and then move the objects from the temporary schema into it (see the sketch below).</p>
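<p>A minimal T-SQL sketch of those moves, assuming a hypothetical table 'Students' under an uppercase schema 'SALES':</p><div style="font-family: Consolas, monospace; font-size: 14px; white-space: pre;">-- park the object in a temporary schema
CREATE SCHEMA tmp;
ALTER SCHEMA tmp TRANSFER SALES.Students;

-- drop the uppercase schema and recreate it in lowercase
DROP SCHEMA SALES;
CREATE SCHEMA sales;

-- move the object into the lowercase schema and clean up
ALTER SCHEMA sales TRANSFER tmp.Students;
DROP SCHEMA tmp;</div><p>Repeat the TRANSFER statements for every object under the schema before dropping it.</p>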
<p>Another option is to parameterize the schema in PowerBI and change it in one place when you migrate to the higher environment.</p>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-57773607103457658932021-11-28T13:27:00.002+05:302021-11-28T19:36:53.922+05:30 Guidewire Datahub Architecture Explained<p>Guidewire Datahub is a P&C enterprise data warehouse solution built on top of the three key Guidewire Insurance Suite systems, namely Guidewire PolicyCenter, BillingCenter and ClaimCenter. It is an ETL framework built using the SAP BODS ETL tool and a Java framework named DIAL for the SCD Type 1 and Type 2 implementation. </p><p>Focused on Property and Casualty insurance clients, Datahub provides an operational datastore that delivers a single version of truth, enabling clients to execute cross-system analytical queries. The system also provides the capability to source data from external data systems. The layers of the architecture enable data quality checks and auditing, to make sure that the data you have in the target is clean and audited. </p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoE0H2qhx8CMoHRqP0xwiHusq77qzicJ56WC-wFwD1OM_UDFNZMgdjdhfuCrKRW3yiX5CDVJGllK7odwofHdbiEJnvkBUOrhW3lNbOcpqodO6NV-ZLroy3Ekku5C3tswZSJOPOMiibsTg/s854/Guidewire_Datahub_Architecture.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="371" data-original-width="854" height="174" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoE0H2qhx8CMoHRqP0xwiHusq77qzicJ56WC-wFwD1OM_UDFNZMgdjdhfuCrKRW3yiX5CDVJGllK7odwofHdbiEJnvkBUOrhW3lNbOcpqodO6NV-ZLroy3Ekku5C3tswZSJOPOMiibsTg/w400-h174/Guidewire_Datahub_Architecture.jpg" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Guidewire Datahub Basic Architecture</td></tr></tbody></table><br /><p>The first layer in the architecture is the sources, which can be any of the three systems (PC, BC or CC), an external source, or a legacy system. In this layer, the data from the sources is loaded onto an FTP location, and from there into the relational database where Guidewire Datahub resides. </p><p>The second layer is the Guidewire Datahub layer, a relational DBMS which can be either an Oracle Database or a SQL Server database. In this layer, data is captured at an entity level in both Type 1 and Type 2. All the data cleansing, quality checks, balancing and auditing happen at this layer. This layer is loaded and managed using the SAP BODS (SAP Data Services) tool.</p><p>The next layer is where you build out your star schemas based on your reporting needs; this layer holds your facts and dimensions and will be used by IBM Cognos Analytics to cater to your reporting needs.</p><p>This forms the basic architecture of the Guidewire Datahub suite. 
Please do let me know if you would like me to cover more topics on the Guidewire Datahub suite.</p>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-23893915735839841702021-09-15T11:15:00.001+05:302021-09-15T11:35:06.740+05:30Databricks Cluster termination due to Cloud Provider Launch Failure - Allocation Failed<p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5ZcZJ20gBUnT7S_Jp8NUDcA9yzjX6QNTfp1Xkkm1sNb-GUe85kUOnLVIHua3KFSNQ0zZ33gbaBYZxYBKzpriJf8t7ZX_8mi-pJtS5XQ28-MaC1QbbDYW2A7fIKD8LnN6LuXKTE-WMhy8//" style="margin-left: 1em; margin-right: 1em;"><img alt="Databricks Cluster Termination due to lack in Allocation of Resources" data-original-height="375" data-original-width="823" height="259" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5ZcZJ20gBUnT7S_Jp8NUDcA9yzjX6QNTfp1Xkkm1sNb-GUe85kUOnLVIHua3KFSNQ0zZ33gbaBYZxYBKzpriJf8t7ZX_8mi-pJtS5XQ28-MaC1QbbDYW2A7fIKD8LnN6LuXKTE-WMhy8/w569-h259/image.png" width="569" /></a></div><br /><p></p><p><b>Issue</b>: Restarting a terminated cluster gives the below error message</p><blockquote><p><span color="rgba(0, 0, 0, 0.8)" face=""Helvetica Neue", Helvetica, Arial, sans-serif, FontAwesome" style="background-color: white; font-size: 13px; font-weight: 700; white-space: pre-wrap;"><i>Cluster terminated.Reason:Cloud Provider Launch Failure</i></span></p><div><i>A cloud provider error was encountered while launching worker nodes. See the <a href="https://docs.microsoft.com/azure/databricks/clusters/clusters-manage#cluster-terminate" style="color: var(--link-text-color); cursor: pointer; text-decoration: none;">Databricks guide</a> for more information.</i><div><i><br />Azure error code: AllocationFailed</i></div><div><i><br />Azure error message: Allocation failed. We do not have sufficient capacity for the requested VM size in this region. Read more about improving likelihood of allocation success at http://aka.ms/allocation-guidance</i></div></div></blockquote><p><br /></p><p><b>Cause</b>: Allocation of resources is failing because the Azure region doesn't have the requested VM size available. 
This is a region-specific error that occurs when the requested resources exceed what the region currently has available, leaving no capacity to get the cluster running.</p><p><b>Resolution:</b> Microsoft has documented the resolutions in this <a href="https://docs.microsoft.com/en-us/troubleshoot/azure/virtual-machines/allocation-failure" target="_blank">article</a>.</p><p><b style="background-color: #fcff01;">Update</b>:- After waiting a while and a couple of restarts, it started running again for us.</p>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-44639737527174774122021-08-28T08:40:00.001+05:302021-08-28T08:40:08.924+05:30Understanding Parquet file format - What, Why, How and When - Explained<p><b><u></u></b></p><div class="separator" style="clear: both; text-align: center;"><b><u><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEij_yBH-U82_j0SKiTUZP-uDFX7K8ANaWUGSvGl0ezz3HWbIrYh4liOAI3al3CfTgDsuiMu8F66YeLCRUD7mVP8_0mNvgFwNAZxr8B94l0LL25SRurirtqdemeSIg9dVgcBcHVCGdZU-FY//" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="176" data-original-width="703" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEij_yBH-U82_j0SKiTUZP-uDFX7K8ANaWUGSvGl0ezz3HWbIrYh4liOAI3al3CfTgDsuiMu8F66YeLCRUD7mVP8_0mNvgFwNAZxr8B94l0LL25SRurirtqdemeSIg9dVgcBcHVCGdZU-FY//" width="320" /></a></u></b></div><b><u><br /></u></b><b><u>What is Parquet Format ?</u></b><p></p><p>Parquet is a columnar (column-oriented) data format, which was initially built to support the Apache Hadoop ecosystem. It is a read-optimized format, which allows efficient reads and efficient compression algorithms.</p><p><b><u>Why Parquet ?</u></b></p><p>When you have a dataset that you want to query very efficiently with good query pruning, Parquet is one file format you could go for. Being a columnar storage option, it greatly improves reads and minimizes input/output operations on the data.</p><p>Below are the summarized advantages of Parquet files,</p><p>1. Efficient query pruning - with the Parquet format, columns that are not required while reading the dataset are efficiently pruned. This minimizes IO operations, thereby making reads fast and efficient.</p><p>2. Able to process large volumes of data, with efficient data compression/encoding mechanisms, thereby helping to reduce storage costs as well.</p><p>3. Parquet, though initially built in the Apache Hadoop ecosystem, works well with most data lake storage systems (ADLS, S3) and is supported as external tables in almost all new-generation columnar data warehouses.</p><p>4. Since Parquet is read-optimized and column-oriented, it is very efficient for data storage in OLAP workloads, where mostly only a subset of columns is fetched.</p><p><b><u>How is Parquet stored on disk ?</u></b></p><p>Below is an example of a parquet file being stored in an Azure Data Lake Gen2 system. 
The source dataset was small, hence you see just one file.</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiiRICxCnwfKIgQQKYZQzuB28Q8mvj6nsZTlv91wDbx0Ig9FfnZTwvqWz_giyD6cwHUE4raxw5C3oQNBQt6Z2eM6oP4hNrNOVVQ_C_qq_HXGAC1ocyPfaKvp10KbCUYvr6MXeLxq3hI1Xs//" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="126" data-original-width="747" height="54" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiiRICxCnwfKIgQQKYZQzuB28Q8mvj6nsZTlv91wDbx0Ig9FfnZTwvqWz_giyD6cwHUE4raxw5C3oQNBQt6Z2eM6oP4hNrNOVVQ_C_qq_HXGAC1ocyPfaKvp10KbCUYvr6MXeLxq3hI1Xs//" width="320" /></a></div><br />There is, however, another way of storing, where you have a root folder for your parquet data, and inside the root folder multiple part files split and store the data, something as below,<p></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxzD3uPF5BPCsqpW3L5JdKl9HzpHJqt-m2aWTPG7VcF2W4UR6o5erAop5w6-o22ESk7sFCBQtXoUWI_L9fTBeVxjyfZa_YyLCXrtI7ZKkQk4xJRpcsDMmJABF3oqs31pqvtmMFzlUy07U//" style="margin-left: 1em; margin-right: 1em;"><img data-original-height="413" data-original-width="1449" height="114" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxzD3uPF5BPCsqpW3L5JdKl9HzpHJqt-m2aWTPG7VcF2W4UR6o5erAop5w6-o22ESk7sFCBQtXoUWI_L9fTBeVxjyfZa_YyLCXrtI7ZKkQk4xJRpcsDMmJABF3oqs31pqvtmMFzlUy07U/w400-h114/image.png" width="400" /></a></div><br /><br /><p></p><p>The fewer part (partition) files you have, the faster your read performance becomes. Spark allows setting the number of partitions for Parquet (manual partitioning); the ideal scenario here is a number of partitions that is neither too few nor too many (see the sketch below).</p>
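<p>A minimal Spark SQL sketch of writing a Parquet table with snappy compression and a controlled number of part files; the table and source names are hypothetical:</p><div style="font-family: Consolas, monospace; font-size: 14px; white-space: pre;">-- the REPARTITION hint controls how many part files get written
CREATE TABLE sales_parquet
USING PARQUET
OPTIONS ('compression' = 'snappy')
AS SELECT /*+ REPARTITION(8) */ * FROM sales_raw;</div>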
<p><b><u>What are the compression schemes in Parquet ?</u></b></p><p>The most widely used compression scheme for Parquet is the Snappy compression mechanism, which can compress the original file to a smaller one, hence also contributing to IO efficiency. Snappy-compressed parquet files show significantly better read performance than parquet files without encoding.</p><p><u><b>When should I use parquet ?</b></u></p><p>Parquet should be considered when you have an OLAP use case, ideally where you only need a subset of columns for a problem; the more columns and unique values you have in a column, the lower the performance of Parquet. So consider Parquet where you have operations that happen on a subset of columns and where you are doing aggregations.</p><p><b><u>References and Further reads</u></b></p><p>I found some great beginner articles on the parquet format online, sharing them below if you are interested to learn more,</p><p></p><ul style="text-align: left;"><li><a href="https://databricks.com/glossary/what-is-parquet" target="_blank">Databricks - What is Apache Parquet ?</a></li><li><a href="https://www.youtube.com/watch?v=1j8SdS7s_NY&t=45s">Parquet Files and Optimization - Youtube</a></li><li><a href="https://spark.apache.org/docs/latest/sql-data-sources-parquet.html" target="_blank">Apache Spark documentation on Parquet</a></li></ul><p></p><p><br /></p>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-61723059199380307642021-08-16T23:54:00.001+05:302021-08-16T23:54:28.559+05:30Not authorized to view container in Azure Blob under Storage Account<p><b>Error</b> - Below is the error description from inside the newly created container under the storage account we created</p><ul style="background-color: white; color: #323130; font-family: az_ea_font, "Segoe UI", az_font, system-ui, -apple-system, BlinkMacSystemFont, Roboto, Oxygen, Ubuntu, Cantarell, "Open Sans", "Helvetica Neue", sans-serif; font-size: 13px; outline: none !important;"><li style="font-weight: inherit; outline: none !important;"><span style="font-weight: inherit; outline: none !important; word-break: break-word;">This request is not authorized to perform this operation. RequestId:342dca41-601e-004f-0dc9-9291aa000000 Time:2021-08-16T18:10:08.0322108Z</span> </li><li style="font-weight: inherit; outline: none !important;"><span style="font-weight: inherit; outline: none !important; word-break: break-word;">This storage account's 'Firewalls and virtual networks' settings may be blocking access to storage services. Try adding your client IP address ('xx.xx.xx.xxx') to the firewall exceptions, or by allowing access from 'all networks' instead of 'selected networks'</span></li></ul><p><br /></p><p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjACqYhwybeETbyyPGsmKQDgDm7wvQidlLmiQGJgUJcqc7YZkRW3FH5ytm8NJ7nnFLmtfRuDNQgwWuVtiPo_Gcm3YegUSjaGk4aurX9syHTgxlrNwQh9aGXb6-hU94_AY5Vlje0JYdk4S4//" style="margin-left: auto; margin-right: auto;"><img data-original-height="683" data-original-width="1128" height="243" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjACqYhwybeETbyyPGsmKQDgDm7wvQidlLmiQGJgUJcqc7YZkRW3FH5ytm8NJ7nnFLmtfRuDNQgwWuVtiPo_Gcm3YegUSjaGk4aurX9syHTgxlrNwQh9aGXb6-hU94_AY5Vlje0JYdk4S4/w400-h243/image.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"></td></tr></tbody></table><br /><br /></p><p><b>Cause</b>: The client IP address of your machine is not added as an exception in the firewall settings of the parent storage account.</p><p><br /></p><p><b>Solution</b>: </p><p>1. 
The solution is to add your IP address as an exception under 'Networking' > 'Firewall and Virtual Networks'. </p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMtwYBclQD1bocBTakxRcY_eaQxFRDUb_2mkaJpPx7wVxEXhEEUEJtqJRIpi1bh8UG4QQdzABl167OrT_pbO8Dz67wT-RRJNzmGMm2mNlbZeVz05FZCr-w22sciI3wnlcXlVeZnWCFwwc//" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="357" data-original-width="385" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMtwYBclQD1bocBTakxRcY_eaQxFRDUb_2mkaJpPx7wVxEXhEEUEJtqJRIpi1bh8UG4QQdzABl167OrT_pbO8Dz67wT-RRJNzmGMm2mNlbZeVz05FZCr-w22sciI3wnlcXlVeZnWCFwwc//" width="259" /></a></div><br /><br /><p></p><p>2. Click on 'Add your client IP Address (xx.xx.xxx.xxx)'.</p><p><br /></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOmybv3rPT3p4SSHBGgWNzb5BztjUTI0cC6DWd8NYps4ftjjdTL2PsUkJYAnT5Mg-n2kTH2CcVnzV8uJBCZfWWj6OxMYl80PpMvSpz-czry8uccOUYlw1YBUyYGMhh-1ZbN_oZnAhS7kA//" style="margin-left: 1em; margin-right: 1em;"><img data-original-height="237" data-original-width="852" height="111" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOmybv3rPT3p4SSHBGgWNzb5BztjUTI0cC6DWd8NYps4ftjjdTL2PsUkJYAnT5Mg-n2kTH2CcVnzV8uJBCZfWWj6OxMYl80PpMvSpz-czry8uccOUYlw1YBUyYGMhh-1ZbN_oZnAhS7kA/w400-h111/image.png" width="400" /></a></div><br /><br /><p></p><p>3. Now click on 'Save' in the top left corner under 'Firewall and Virtual Networks'.</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgICWwtR4v6dcEzTcXfwUtknSyHdfc4akv7i93Ni5hzvUpYZhxnXaNoaWowatM3OxBthwSzBDyFuVRVRMsKMowzK_oiEq07m5Gg4aaCNW9qM12K8UHVAN10FENOm7NidwND-tasRwoIksc//" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="158" data-original-width="518" height="98" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgICWwtR4v6dcEzTcXfwUtknSyHdfc4akv7i93Ni5hzvUpYZhxnXaNoaWowatM3OxBthwSzBDyFuVRVRMsKMowzK_oiEq07m5Gg4aaCNW9qM12K8UHVAN10FENOm7NidwND-tasRwoIksc//" width="320" /></a></div><br />4. 
Now navigate back to the container, and you should be able to access it.<p></p><p><br /></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiiM3LZzIyFHCMgG7-rxUGovlebHaqCsse2iVj2Cwr1HJL29BrajEAJQwPhIe72htP9Jo0SCd3OXTtogjww-nVpSSN4ie4Cr2JcS5yY0J2sdayv0s8jU9jAk5zRSMoEwH_w73naVf4kxHE//" style="margin-left: 1em; margin-right: 1em;"><img data-original-height="500" data-original-width="1026" height="195" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiiM3LZzIyFHCMgG7-rxUGovlebHaqCsse2iVj2Cwr1HJL29BrajEAJQwPhIe72htP9Jo0SCd3OXTtogjww-nVpSSN4ie4Cr2JcS5yY0J2sdayv0s8jU9jAk5zRSMoEwH_w73naVf4kxHE/w400-h195/image.png" width="400" /></a></div><br /><br /><p></p>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-39777990028656819892021-08-16T23:36:00.003+05:302021-08-16T23:36:46.316+05:30Error 22300 - Azure Data Factory Linked Service cannot connect to Azure SQL Database<p><b>Error: </b><span style="background-color: white; color: #323130; font-family: "Segoe UI"; font-size: 13px;"><i>Error Code 22300 - Cannot connect to SQL Database: 'db-learn-azure.database.windows.net', Database: 'learn-azure-athul', User: 'athul'. Check the linked service configuration is correct, and make sure the SQL Database firewall allows the integration runtime to access. Cannot open server 'db-learn-azure' requested by the login. Client with IP address '20.xx.xx.xxx' is not allowed to access the server.</i></span></p><p><br /></p><p></p><div class="separator" style="clear: both; text-align: center;"><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixPov4yVV8eEIXjJrJ9lIAClsqWccDV7FVui0tyseZBkSjdnN2QUNpwgs9C6P2uRJIrDAO19GNVz2dpl756ZSdbKVuNI63saHdpiOc_Qf3mhc7xUtx3XbEIZgskf-GquXE2RqlgrckM_E//" style="margin-left: 1em; margin-right: 1em;"><img alt="Error Code 22300" data-original-height="450" data-original-width="635" height="227" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixPov4yVV8eEIXjJrJ9lIAClsqWccDV7FVui0tyseZBkSjdnN2QUNpwgs9C6P2uRJIrDAO19GNVz2dpl756ZSdbKVuNI63saHdpiOc_Qf3mhc7xUtx3XbEIZgskf-GquXE2RqlgrckM_E/w320-h227/image.png" title="Firewall error ADF and Azure SQL Database" width="320" /></a></div><br /><br /></div><br /><b>Cause:</b> This happens because the IP address of the Azure Data Factory instance is not allowed through the Azure SQL Database firewall.<p></p><p><b>Resolution:</b> </p><p>1. Go to the Azure SQL Database instance in the Azure Portal > Firewall settings.</p><p>2.<span> Change 'Allow Azure Services and Resources to access this server' to 'Yes' and save the firewall settings.</span></p><p><br /></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh76dDITlkccWXvBBhl3qpoK3MQ5g0farWOEeQqLePuF0rZiTtfIaRljYRGVH9x_jpT0b4iP3Eh7vSFfgAmxKcU_961sGjpIBlm2Y79FhYi68iywTMyMZtwq5NbJ2yVjua3sLUq2ysyNSc//" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="345" data-original-width="621" height="178" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh76dDITlkccWXvBBhl3qpoK3MQ5g0farWOEeQqLePuF0rZiTtfIaRljYRGVH9x_jpT0b4iP3Eh7vSFfgAmxKcU_961sGjpIBlm2Y79FhYi68iywTMyMZtwq5NbJ2yVjua3sLUq2ysyNSc//" width="320" /></a></div><br />3. 
Now try connecting again.<p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMhXXfhhMo45FpWHKAWrDQ1dXGGUaQMrbYG3LH27K-vimY2nKZbhF9pxx0lumsBOkqG27MflwwsAobzEJv6jlUfdLhYK4_4aoM7ceRB7Kdyb1jiyn3mNeCpmSno5QF6at9qhrjZV680HI//" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="463" data-original-width="891" height="208" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMhXXfhhMo45FpWHKAWrDQ1dXGGUaQMrbYG3LH27K-vimY2nKZbhF9pxx0lumsBOkqG27MflwwsAobzEJv6jlUfdLhYK4_4aoM7ceRB7Kdyb1jiyn3mNeCpmSno5QF6at9qhrjZV680HI/w400-h208/image.png" width="400" /></a></div><br /><br /><p></p>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-61427093476730147442021-07-06T19:21:00.002+05:302021-07-06T19:21:25.245+05:30Synapse - CREATE TABLE as SELECT with Distribution - Not recognized error [Solved]<p><span style="font-family: georgia;"><b>Scenario:</b> While trying to create a backup of a table in one Synapse schema as a backup table in another schema, the below error comes up, </span></p><p><span style="font-family: trebuchet;">example syntax below,</span></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZO-hZKsbFnkyUqvwX8x4YmQnUvXyFzRGaZvjSdJ_97H-llAroWPD-OIh20yXKtmo6CBL94mTxTBoSRFUvZSjF29eqv9k62BGQIv-eCtakXT6gbgjRsYlT2UJGoWtX2CMUXNyBOWnuMxM/s926/Azure_Synapse_Create_Table_Select_Distribution_Error.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="254" data-original-width="926" height="177" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZO-hZKsbFnkyUqvwX8x4YmQnUvXyFzRGaZvjSdJ_97H-llAroWPD-OIh20yXKtmo6CBL94mTxTBoSRFUvZSjF29eqv9k62BGQIv-eCtakXT6gbgjRsYlT2UJGoWtX2CMUXNyBOWnuMxM/w644-h177/Azure_Synapse_Create_Table_Select_Distribution_Error.JPG" width="644" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><span style="color: #cc0000; font-family: trebuchet; font-size: small;">Msg 103022, Level 16, State 1, Line 4</span><div><span style="color: #cc0000; font-family: trebuchet; font-size: small;">Parse error at line: 2, column: 8: DISTRIBUTION is not a recognized table or a view hint.</span><p><br /></p><p><span style="font-family: trebuchet;"><b>Solution:</b> To resolve the issue, change the syntax from the existing </span></p><blockquote><p><span style="font-family: trebuchet;">CREATE TABLE mySchema.bkp_Students AS SELECT * FROM enrollment.Students</span></p><p><span style="font-family: trebuchet;">WITH ( DISTRIBUTION=HASH(Student_id),</span></p><p><span style="font-family: trebuchet;">CLUSTERED COLUMNSTORE INDEX ) </span></p></blockquote><p><span style="font-family: trebuchet;"><br /></span></p><p><span style="font-family: trebuchet;">to</span></p><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><p style="text-align: left;"><span style="font-family: trebuchet;">CREATE TABLE mySchema.bkp_Students </span></p><p style="text-align: left;"><span style="font-family: trebuchet;">WITH ( DISTRIBUTION=HASH(Student_id),</span></p><p
style="text-align: left;"><span style="font-family: trebuchet;">CLUSTERED COLUMNSTORE INDEX ) </span></p><p style="text-align: left;"><span style="font-family: trebuchet;">AS SELECT * FROM enrollment.Students</span></p></blockquote><p><span style="font-family: trebuchet;"><br /></span></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxVsm0sFzYUKjx1-7GhzJ5DBj3YnlXy64wi9RGH5NO1_5FqfmqpcQ1V7CJYI0G2ZLAPmPVfSUxsmWcPb5urLKxKuedO8mXb6NgL4UjJlHhSFo_rTniGnCzAyBX-imw26JHm7oEbmCCmzQ/s594/Azure_Synapse_Create_Table_Hash_before_Insert.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: trebuchet;"><img border="0" data-original-height="286" data-original-width="594" height="247" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxVsm0sFzYUKjx1-7GhzJ5DBj3YnlXy64wi9RGH5NO1_5FqfmqpcQ1V7CJYI0G2ZLAPmPVfSUxsmWcPb5urLKxKuedO8mXb6NgL4UjJlHhSFo_rTniGnCzAyBX-imw26JHm7oEbmCCmzQ/w513-h247/Azure_Synapse_Create_Table_Hash_before_Insert.JPG" width="513" /></span></a></div><span style="font-family: trebuchet;"><br /></span><p>A table needs to have a default distribution before data can be inserted into it.</p></div>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-76061793774700362262021-06-13T22:22:00.000+05:302021-06-13T22:22:15.332+05:30Learning plan to clear DP 900 - Azure Data Fundamentals Certification<p></p><p><a href="https://www.credly.com/badges/df896212-10aa-4d76-9a9a-0da42a42220c/public_url" imageanchor="1" style="clear: left; float: left; font-family: georgia; margin-bottom: 1em; margin-right: 1em;" target="_blank"><img alt="Azure Data Fundamentals DP 900 Badge" border="0" data-original-height="600" data-original-width="600" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHOMCIqWIuwGzrZT9oeqGNW8t7AB1c2ANLhrSywJlTsP-cM5MifGQS0mV0lt77i65KySkWAcZW8RkpQUCgA0z5KeUfw9uuEVrT8Qat9NonRruU-9n2M9X0T7qZC_5GWNwdqAA84N0-ILI/w200-h200/azure-data-fundamentals_DP_900_Badge.png" title="Azure Data Fundamentals DP 900 Badge" width="200" /></a></p><p></p><span style="font-family: georgia;">After having worked on couple of small data engagements in Azure Cloud, I have started my journey towards achieving a Azure Data Engineer Associate Badge. As a first, i started with DP 900, which is a beginner data fundamental course which focuses on core concepts of data and data services available in Azure. The certification excepts you to have a basic understanding of all key services and concepts in Data warehousing and Visualization.</span><p></p><p><span style="font-family: georgia;">Like any other online certification exam, 30 minutes before the actual schedule the test link starts to become active. </span><span style="font-family: georgia;">A proctor(invigilator) will join you on chat and verify your government ID details and the surroundings of your test room, which you have to take a screenshot of. Once they are convinced that you are a fair candidate, They initiate the exam. </span></p><div><span style="font-family: georgia;">Coming to the exam, I had a total of 49 questions, and was fairly easy considering I have some experience on the topic and that i had only prepared just a a day ahead. 
</div>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-76061793774700362262021-06-13T22:22:00.000+05:302021-06-13T22:22:15.332+05:30Learning plan to clear DP 900 - Azure Data Fundamentals Certification<p></p><p><a href="https://www.credly.com/badges/df896212-10aa-4d76-9a9a-0da42a42220c/public_url" imageanchor="1" style="clear: left; float: left; font-family: georgia; margin-bottom: 1em; margin-right: 1em;" target="_blank"><img alt="Azure Data Fundamentals DP 900 Badge" border="0" data-original-height="600" data-original-width="600" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHOMCIqWIuwGzrZT9oeqGNW8t7AB1c2ANLhrSywJlTsP-cM5MifGQS0mV0lt77i65KySkWAcZW8RkpQUCgA0z5KeUfw9uuEVrT8Qat9NonRruU-9n2M9X0T7qZC_5GWNwdqAA84N0-ILI/w200-h200/azure-data-fundamentals_DP_900_Badge.png" title="Azure Data Fundamentals DP 900 Badge" width="200" /></a></p><p></p><span style="font-family: georgia;">After having worked on a couple of small data engagements in Azure Cloud, I have started my journey towards the Azure Data Engineer Associate badge. To start, I took DP 900, which is a beginner data fundamentals certification that focuses on core concepts of data and the data services available in Azure. The certification expects you to have a basic understanding of all key services and concepts in data warehousing and visualization.</span><p></p><p><span style="font-family: georgia;">Like any other online certification exam, the test link becomes active 30 minutes before the actual schedule. </span><span style="font-family: georgia;">A proctor (invigilator) will join you on chat and verify your government ID details and the surroundings of your test room, which you have to take a screenshot of. Once they are convinced that you are a fair candidate, they initiate the exam. </span></p><div><span style="font-family: georgia;">Coming to the exam, I had a total of 49 questions, and it was fairly easy considering I have some experience on the topic and had prepared only a day ahead. I was able to score 922 in the exam with that single day of preparation.</span></div><div><span style="font-family: georgia;"><br /></span></div><div><span style="font-family: georgia;">Below are the materials that I used to prepare, </span></div><div><br /></div><div><ul><li><font face="georgia">FreeCodeCamp.Org crash course in Youtube - Partnered with ExamPro.com</font></li></ul></div><div><font face="georgia">I went through this fairly basic video course on FreeCodeCamp. It is not detailed or exhaustive enough if you want to go deep, but for every Azure service or data concept they do cover the basic functionality, and at the end of each lesson they have a cheat sheet which I found really helpful for a last-minute read. </font></div><div><font face="georgia"><br /></font></div><div><font face="georgia">Some parts are a little vaguely explained, so I suggest going for the Udemy course below if you need a deeper understanding.</font></div><div><font face="georgia"><br /></font></div><div><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="BLOG_video_class" height="266" src="https://www.youtube.com/embed/P3qmqUZJ7l0" width="320" youtube-src-id="P3qmqUZJ7l0"></iframe></div></div><div style="text-align: center;"><br /></div><div style="text-align: center;"><font face="georgia"><br /></font></div><div style="text-align: center;"><ul><li style="text-align: left;"><font face="georgia">Practice questions in Udemy Course - <a href="https://www.udemy.com/course/azure-dp-900/">DP-900: Microsoft Azure Data Fundamentals Video Course + Qus</a> </font></li></ul><div style="text-align: left;"><font face="georgia"><span>This is an absolutely wonderful course. If you really have the time, do go through this course. More than the course, there are two practice sets of questions at the end of it: the first with around 93 questions and the second with 100 questions. The questions I got on the exam were mostly connected to the questions in this course, maybe a little re-worded or rearranged, but the basic content remains the same. I did both these question sets around 2 to 3 times in the hours before the exam and made sure I consistently got 90%+ scores. </span></font></div><div style="text-align: left;"><font face="georgia"><span><br /></span></font></div><div style="text-align: left;"><div><font face="georgia">To summarize, if you are running short on time, start with the 3-hour FreeCodeCamp course, taking thorough notes and making sure you understand it. Once you are done with the course, repeat these practice questions from the Udemy course until you consistently get a 90%+ score.</font></div><div><font face="georgia"><br /></font></div><div><span style="font-family: georgia;">Happy learning. Let us know if you were able to complete the certification following this approach.</span></div></div></div>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0tag:blogger.com,1999:blog-2169055826346378257.post-66349309778617676292020-11-21T17:19:00.002+05:302020-11-21T17:19:50.745+05:30Get Error log for child pipelines in an execute pipeline activity in ADF<p><b>Scenario: </b>We have a pipeline in Azure Data Factory with a few lookup activities, which then sequentially executes another child pipeline using the execute pipeline activity. 
See the flow below,</p><p>The screenshot below is the parent pipeline, where you have two lookups and a sequential execute pipeline step which calls another child pipeline.</p><p><br /></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigk_DksQkuZHBgUGjNDo8xDepv3THH9sZuCT4dWh8t-2daO3XSGIrz1MhwPnmoP_DbHDaJfPpmAVo3QUea_TM1vS28zZa6SYFrW6oR0hRSsKs4TeZPUL2E0cOebZL9S9mU7X2BXv7N3ME/s482/ADF_Pipeline_1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="238" data-original-width="482" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigk_DksQkuZHBgUGjNDo8xDepv3THH9sZuCT4dWh8t-2daO3XSGIrz1MhwPnmoP_DbHDaJfPpmAVo3QUea_TM1vS28zZa6SYFrW6oR0hRSsKs4TeZPUL2E0cOebZL9S9mU7X2BXv7N3ME/s16000/ADF_Pipeline_1.jpg" /></a></div><div><br /></div><div><br /></div>See below the child pipeline, where you have a lookup and a ForEach1 loop,<div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyM1_gH28EgIwbMZ6YIP6TjP3rHBlJAs9NeKxK36ZPqtg48aGCFA62HFbJVft6PcECslPIQwtCtGfvteM84Ofg8CJt9cyGTXAygV3kydFk-GB52JEr1RcOYY_jliRzixP-Xq73Wz6lgS0/s509/ADF_Child_Pipeline.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="193" data-original-width="509" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyM1_gH28EgIwbMZ6YIP6TjP3rHBlJAs9NeKxK36ZPqtg48aGCFA62HFbJVft6PcECslPIQwtCtGfvteM84Ofg8CJt9cyGTXAygV3kydFk-GB52JEr1RcOYY_jliRzixP-Xq73Wz6lgS0/s16000/ADF_Child_Pipeline.jpg" /></a></div><div><br /></div><div><br /></div>The problem here is, when we execute the parent pipeline and something fails in, say, the ForEach loop in the child pipeline, ADF doesn't surface the actual error from the child pipeline in the parent pipeline and only gives out the following error messages, which are very generic.</div><div><br /></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7ys_imE3zLIlI7DPLyjnyFpB2AFrDefKx_awpMUFe01na2SSkoBcV3GqJqWE-WRi6g2-Wdf1QlT7CO3zqSfCyNkFRfIORqtceRO5Jp6DrB9buEiEQ-Eb4aiYx5bjUSKeKcbqh04mRaO4/s968/ADF_Parent_Pipeline_error.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="275" data-original-width="968" height="181" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7ys_imE3zLIlI7DPLyjnyFpB2AFrDefKx_awpMUFe01na2SSkoBcV3GqJqWE-WRi6g2-Wdf1QlT7CO3zqSfCyNkFRfIORqtceRO5Jp6DrB9buEiEQ-Eb4aiYx5bjUSKeKcbqh04mRaO4/w640-h181/ADF_Parent_Pipeline_error.jpg" width="640" /></a></div><div><br /></div><div><br /></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgxF347LLcLHQRMj9Hcg9DAnvRNLXEFFGZE7SFuvh19AZu2pL6OxJd63D56MhbxsqA9IhTMxV5Ii0PYFfenP3sQEhcrz4wAx3GQvGzYB62MVRQlqvAxauM4hE4GtLzohqc1CEZIAtrcFI/s469/ADF_Generic_error.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="211" data-original-width="469" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgxF347LLcLHQRMj9Hcg9DAnvRNLXEFFGZE7SFuvh19AZu2pL6OxJd63D56MhbxsqA9IhTMxV5Ii0PYFfenP3sQEhcrz4wAx3GQvGzYB62MVRQlqvAxauM4hE4GtLzohqc1CEZIAtrcFI/s16000/ADF_Generic_error.jpg" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div
class="separator" style="clear: both; text-align: center;"><br /></div>Now if you see the error above it is very generic and says something failed in the For Each loop in the child pipeline. <div><br /></div><div><b>Solution:</b> The solution is simple, just to go the execute pipeline error step, click on output as shown below.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjCUKgfPo36G0wlU9ag1f0aGkAwMSvXrwo4pN0jHkE4xEiMaS8NfdegIAcvPjoIPy7jQswKuM3d5O6A-P7rHsUnwn8X0Ba5fHyRt9A5kyg4eAU9uJnknXmu169Z38wecmuaXuk8WMX5fg/s712/ADF_ExecutePipeline_output_error_log.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="85" data-original-width="712" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjCUKgfPo36G0wlU9ag1f0aGkAwMSvXrwo4pN0jHkE4xEiMaS8NfdegIAcvPjoIPy7jQswKuM3d5O6A-P7rHsUnwn8X0Ba5fHyRt9A5kyg4eAU9uJnknXmu169Z38wecmuaXuk8WMX5fg/s16000/ADF_ExecutePipeline_output_error_log.jpg" /></a></div><br /><div><br /></div><div>When you click on output, the below popup box appears, click on pipelineRunid which takes you to the child pipelin.e</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhg-vkrhMZGDBhzKgbyzInv_MzDkeQocNch2MUrQwfr_7mgDMY32l2cZLSjwPt47fd__KzSMzKqNkxArE4OPVWamfKgL3gqwM181xX1Go6i-fJtk_QeiqQihKRM397Ae0dee_poXbGrgpo/s457/ADF_ExecutePipeline_output_error_log_runid.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="186" data-original-width="457" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhg-vkrhMZGDBhzKgbyzInv_MzDkeQocNch2MUrQwfr_7mgDMY32l2cZLSjwPt47fd__KzSMzKqNkxArE4OPVWamfKgL3gqwM181xX1Go6i-fJtk_QeiqQihKRM397Ae0dee_poXbGrgpo/s16000/ADF_ExecutePipeline_output_error_log_runid.jpg" /></a></div><br /><div><br /></div><div><br /></div><div>Once inside the child pipeline, click anywhere in the workspace and go to output tab, you should be able to see the failure logs as below.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2lOSDHoCn9SazMwH1i4nrcj_udPUpXzCGQd0Df7uUBz0h6LK88PwnuPt3f3PyNjiDksNxT27VDJiyj8yl_U2Tfaz9whhY4Sf3SlrbEfVGSzEjTnqESlNZQnLSoN7_gVd5Nf8ihKgxqPA/s904/ADF_Child_pipeline_execute_pipeline_error_logs.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="382" data-original-width="904" height="169" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2lOSDHoCn9SazMwH1i4nrcj_udPUpXzCGQd0Df7uUBz0h6LK88PwnuPt3f3PyNjiDksNxT27VDJiyj8yl_U2Tfaz9whhY4Sf3SlrbEfVGSzEjTnqESlNZQnLSoN7_gVd5Nf8ihKgxqPA/w400-h169/ADF_Child_pipeline_execute_pipeline_error_logs.jpg" width="400" /></a></div><br /><div><br /></div><div><br /></div><div>Hope this helped you. </div><div><br /></div><div><div><br /></div></div>Athul Jayachandranhttp://www.blogger.com/profile/15977854221184512027noreply@blogger.com0