My last blog post covered how Azure Data Factory includes my Azure tenant’s tenantId and principalId in its commits to Git, and how to clean that up. Since publishing it, I have been playing around more with Git and GitHub and believe I may have found a solution (albeit a somewhat clunky one!) that allows me to share my work on an active ADF project.
Paul Pietzko’s post about using mirroring to create a personal copy of a repo gave me the missing piece: with a few extra steps before removing the sensitive data, I can maintain a workable private GitHub repo for my Azure Data Factory projects while also sharing my work publicly.
First, I have my private Azure Data Factory repo – let’s call it adf-private. Following Paul’s instructions, I clone it as a bare mirror using Git Bash.
(Make sure you launch Git Bash in the folder on your local machine where you want the clone to live.)
git clone --mirror https://github.com/<my account>/adf-private.git
Next, create a totally empty repo on GitHub. I’ve called mine adf-public. This repo will ultimately be public after we clean up the latest commit and history, but for now make it private.
Then push the mirror to the empty GitHub repo.
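Note here that we are cd-ing into the local adf-private.git mirror but setting its push URL to the adf-public repo! Because of the --push flag, only the push URL changes; fetches would still come from adf-private.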
cd adf-private.git
git remote set-url --push origin https://github.com/<my account>/adf-public.git
git push --mirror
At this point, I was able to confirm that adf-public on GitHub contained all the files and history from the adf-private repo, so I deleted the local mirror clone (the adf-private.git folder) from my machine. Then I made a regular local clone (no mirroring!) of adf-public from GitHub and stepped through the process from my previous post to remove the sensitive strings from adf-public.
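Before flipping a repo to public, it can be reassuring to check that the sensitive strings really are gone from the history and not just from the latest commit. The following is a minimal sketch of one way to do that, not part of the cleanup process itself. The function name commits_touching is made up for illustration, and it relies on git log's -S ("pickaxe") option, which lists commits whose changes add or remove a given string. By default it does not inspect merge commits, so treat an empty result as a good sign rather than a guarantee.

```shell
#!/usr/bin/env bash

# Hypothetical helper (name invented for this sketch).
# Prints the hash of every commit, on any ref, whose changes add or remove
# the given string. Prints nothing if no such commit is found.
#   $1 = path to a local clone, $2 = string to search for
commits_touching() {
  local repo="$1" needle="$2"
  git -C "$repo" log --all --format=%H -S"$needle"
}
```

For example, running commits_touching adf-public "<my tenantId>" from the folder that contains the local clone should print nothing once the cleanup has worked.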
With the cleanup complete and the changes pushed to GitHub, I was able to update the settings of adf-public on GitHub to make it public and share my work with others. However, any new work I commit to adf-private will not be mirrored to adf-public, so I will have to go through the whole process of mirroring and cleaning up the history again every time I want to share my new changes. Overall, it’s not the most straightforward solution, but it does get me a bit further along the path!
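Since the mirroring half of the process has to be repeated for every round of sharing, it could be wrapped in a small shell function. This is only a sketch under a few assumptions: the name remirror is made up, the mirror is cloned into a temporary folder instead of the current one, and the target repo is private (or has been made private again) before pushing, because the mirror still contains the sensitive strings. If the source repo has pull requests, GitHub may also reject some of the mirrored refs. The history cleanup still has to be done by hand afterwards.

```shell
#!/usr/bin/env bash

# Hypothetical helper (name invented for this sketch).
# Mirrors every ref from a source repo into a target repo.
#   $1 = source URL (e.g. adf-private), $2 = target URL (e.g. adf-public)
remirror() {
  local src="$1" dst="$2" workdir
  workdir="$(mktemp -d)" || return 1

  # Bare mirror clone: all branches, tags, and other refs from the source.
  git clone --quiet --mirror "$src" "$workdir/mirror.git" || return 1

  # Change only the push URL; fetches would still come from the source.
  git -C "$workdir/mirror.git" remote set-url --push origin "$dst" || return 1

  # Overwrite the target's refs with the mirror's refs.
  git -C "$workdir/mirror.git" push --quiet --mirror || return 1

  # The local mirror is no longer needed once the push has succeeded.
  rm -rf "$workdir"
}
```

With the URLs from this post, the call would look like remirror https://github.com/<my account>/adf-private.git https://github.com/<my account>/adf-public.git, followed by a fresh regular clone of adf-public and the usual cleanup.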