Along with that, it provides jobs, a catalog, and transformation management under one roof.
09.08.2025 11:00 @hardikdg.bsky.social
Always learning new technologies and trends | Exploring and sharing Data domain learning and expr.
Using AWS for data pipelines and ETL and need more granular access controls?
AWS Lake Formation is the key.
AWS Lake Formation offers fine-grained access control for your lakehouse. Control access by IAM role, column, or row. One lake, multiple secure users.
No matter how advanced AI becomes, even the best LLM can't lift for you.
Strength isn't just physical, it's mental resilience too.
Train both. Don't just train systems.
Don't just automate work. Automate recovery.
Sleep routines
Movement habits
Tech-free time
Code your life like a system.
Important rule for metrics trust:
Your business KPIs deserve tested logic.
Use tests like not_null, unique, and accepted_values.
One unnoticed error can break both your KPIs and the trust in them. Prevention is better than cure.
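A minimal sketch of what those three tests check, written in plain Python so it runs anywhere. The names mirror dbt's built-in generic tests, but the functions, the `orders` sample, and the allowed status values are all assumptions for illustration, not dbt's implementation.

```python
from collections import Counter

def not_null(rows, column):
    """Return rows where `column` is missing (None)."""
    return [r for r in rows if r.get(column) is None]

def unique(rows, column):
    """Return the non-null values of `column` that appear more than once."""
    counts = Counter(r[column] for r in rows if r.get(column) is not None)
    return sorted(v for v, n in counts.items() if n > 1)

def accepted_values(rows, column, allowed):
    """Return rows whose non-null `column` value is outside `allowed`."""
    return [r for r in rows if r.get(column) is not None and r[column] not in allowed]

# Hypothetical sample data with one problem per check.
orders = [
    {"order_id": 1, "status": "paid"},
    {"order_id": 2, "status": "shipped"},
    {"order_id": 2, "status": "refunded"},   # duplicate order_id
    {"order_id": None, "status": "paid"},    # missing order_id
    {"order_id": 5, "status": "unknown"},    # unexpected status
]

failures = {
    "not_null(order_id)": not_null(orders, "order_id"),
    "unique(order_id)": unique(orders, "order_id"),
    "accepted_values(status)": accepted_values(
        orders, "status", {"paid", "shipped", "refunded"}
    ),
}
for name, bad in failures.items():
    print(name, "FAILED" if bad else "passed", bad)
```

Each check returns the offending rows or values, so a failing KPI input can be traced back instead of only being counted.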
Learning for analytics tables:
- Denormalize where it counts.
- For wide reporting tables, fewer joins = faster dashboards.
- Use materialized views for complex joins to save BI tools from doing heavy lifting.
Building the future? You need future-proof energy.
Less sugar, more sunlight.
Less screen time, more steps.
It's not complex.
Golden rule for query speed:
Query only what you need.
Avoid loading full tables into memory.
One dashboard query that scanned 1 TB dropped to 80 GB with proper partitioning and column pruning.
Less is faster. Always.
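Here is a toy sketch of why partitioning plus column pruning cuts the data scanned. It assumes a hypothetical table stored as one list of rows per date partition and counts how many cell values a reader touches; real engines do this at the file and column-chunk level, so treat the numbers as illustration only.

```python
def scan(table, columns=None, start=None, end=None):
    """Read rows, skipping partitions outside [start, end] and unwanted columns.

    Returns the selected rows and the number of cell values actually read.
    """
    rows, cells_read = [], 0
    for day, part in table.items():
        # Partition pruning: ISO date strings compare correctly as text.
        if (start and day < start) or (end and day > end):
            continue
        for row in part:
            # Column pruning: keep only the columns the query asked for.
            keep = row if columns is None else {c: row[c] for c in columns}
            cells_read += len(keep)
            rows.append(keep)
    return rows, cells_read

# Hypothetical table: 30 daily partitions, 5 rows each, 4 columns per row.
table = {
    f"2025-07-{d:02d}": [
        {"user_id": u, "amount": u * d, "country": "IN", "notes": "free text"}
        for u in range(1, 6)
    ]
    for d in range(1, 31)
}

_, full = scan(table)
rows, pruned = scan(table, columns=["amount"], start="2025-07-29", end="2025-07-30")
print(f"full scan read {full} cells, pruned scan read {pruned}")
# full scan read 600 cells, pruned scan read 10
```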
Well-being is part of the strategy behind every product you build.
Don't be the last to realize.
Golden rule for data ingestion pipelines:
Validate, then transform.
Add row count checks, data type assertions, and null audits before every major step.
It's cheaper to fail fast than fix late.
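A small sketch of "validate, then transform", assuming a batch is a list of dicts. The schema, column names, and the cents-to-currency transform are hypothetical; the point is only that the step raises before any transformation runs.

```python
def validate(rows, schema, min_rows=1):
    """Return a list of problems; an empty list means the batch may proceed."""
    problems = []
    # Row count check.
    if len(rows) < min_rows:
        problems.append(f"row count {len(rows)} is below the minimum {min_rows}")
    for i, row in enumerate(rows):
        for column, expected_type in schema.items():
            value = row.get(column)
            if value is None:
                # Null audit.
                problems.append(f"row {i}: {column} is null")
            elif not isinstance(value, expected_type):
                # Data type assertion.
                problems.append(
                    f"row {i}: {column} is {type(value).__name__}, "
                    f"expected {expected_type.__name__}"
                )
    return problems

def transform(rows):
    """Example transform: convert amounts from cents to whole currency units."""
    return [{**r, "amount": r["amount_cents"] / 100} for r in rows]

def run_step(rows, schema):
    """Fail fast: refuse to transform a batch that does not validate."""
    problems = validate(rows, schema)
    if problems:
        raise ValueError("validation failed: " + "; ".join(problems))
    return transform(rows)

schema = {"order_id": int, "amount_cents": int}
good = [{"order_id": 1, "amount_cents": 1250}]
bad = [
    {"order_id": 2, "amount_cents": "1250"},   # wrong type
    {"order_id": None, "amount_cents": 300},   # null key
]

print(run_step(good, schema))
print(validate(bad, schema))
```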
No one ever scaled a company on bad sleep, skipped meals and unhealthy routines.
Ambition is great. But sustainability is smarter.
Active communication makes dashboards and data useful; silos only create chaos when something changes.
02.08.2025 14:02
Golden rule for reliable dashboards:
Document, version, communicate actively.
Renaming a column without notice broke 6 dashboards overnight.
Use schema registries or contracts and never hotfix in production.
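One way to picture a contract is a versioned mapping of column names to types that gets checked before a change ships. The sketch below assumes a hypothetical `CONTRACT_V1` and a renamed `revenue` column; it is not the API of any particular schema registry.

```python
def contract_violations(contract, actual):
    """Compare a published column contract with the columns a table actually has."""
    problems = []
    for column, dtype in contract.items():
        if column not in actual:
            problems.append(f"missing column: {column}")
        elif actual[column] != dtype:
            problems.append(f"type change on {column}: {dtype} -> {actual[column]}")
    return problems

# Hypothetical contract that dashboards depend on (versioned alongside the code).
CONTRACT_V1 = {"order_id": "int", "order_date": "date", "revenue": "float"}

# Someone renamed `revenue` to `net_revenue` in the table.
actual_columns = {"order_id": "int", "order_date": "date", "net_revenue": "float"}

violations = contract_violations(CONTRACT_V1, actual_columns)
if violations:
    print("Blocking deploy:", violations)
# Blocking deploy: ['missing column: revenue']
```

Run as a pre-deploy check, a rename like this would surface as a failed build rather than as broken dashboards the next morning.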
Seek excellence, not perfection.
01.08.2025 13:45
Golden rule for time zones:
Store in UTC on the server side. Convert only for display on the front end.
This will save you a lot of trouble later.
One campaign ended 10 hours early because someone used NOW() in the wrong timezone.
In dbt or SQL, normalize time at the source.
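A short sketch of the same rule in Python, using only the standard library. The campaign end time and the viewer offsets are made up, and fixed offsets are used for simplicity; a real application would more likely use named time zones so that daylight saving time is handled.

```python
from datetime import datetime, timedelta, timezone

# Server side: create and store timezone-aware UTC timestamps only.
campaign_end_utc = datetime(2025, 8, 1, 18, 30, tzinfo=timezone.utc)

def is_campaign_active(now_utc, end_utc):
    """Compare aware UTC datetimes; no local clock is involved."""
    return now_utc < end_utc

def for_display(ts_utc, offset_minutes):
    """Convert to a viewer's UTC offset only at display time."""
    viewer_tz = timezone(timedelta(minutes=offset_minutes))
    return ts_utc.astimezone(viewer_tz).isoformat()

print(for_display(campaign_end_utc, 330))    # 2025-08-02T00:00:00+05:30
print(for_display(campaign_end_utc, -240))   # 2025-08-01T14:30:00-04:00
print(is_campaign_active(datetime.now(timezone.utc), campaign_end_utc))
```

The stored value never changes; only its rendering does, so two viewers in different zones still agree on the moment the campaign ends.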
You're monitoring the server and the product's health.
But what about yours?
Fitness, heart rate, sleep quality, and stress levels.
Track what matters. Your health is what you will have for life.
- AWS DMS (Database Migration Service): Once the schema is ready, use DMS to migrate or replicate the data to AWS database services, depending on your use case.
You can either migrate the data in one go or, if needed, keep replicating it to AWS for some time before you shut down your old DB.
Planning to move from another DB provider to AWS to get the benefits of the cloud and AI?
These are your best friends in the journey.
- AWS Schema Conversion Tool (SCT): It helps verify and convert schema and data from other DB providers to the schema supported by AWS (Aurora, RDS, etc.).
Best practices for WHERE filters:
Wrapping an indexed column in a function usually stops the database from using a plain index on that column.
Bad: WHERE YEAR(order_date) = 2023
Good: WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31'
Write code that runs faster and keeps the end user waiting less.
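The difference can be seen in a query plan. This sketch uses Python's built-in SQLite with a hypothetical `orders` table and index; SQLite has no `YEAR()`, so `strftime('%Y', ...)` stands in for it. Exact plan wording varies by database and version, so only the SCAN versus SEARCH distinction matters here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, total REAL)")
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")
conn.executemany(
    "INSERT INTO orders (order_date, total) VALUES (?, ?)",
    [
        (f"{year}-{month:02d}-15", 10.0 * month)
        for year in (2022, 2023, 2024)
        for month in range(1, 13)
    ],
)

# Function applied to the column: every row has to be evaluated.
bad = "SELECT id FROM orders WHERE strftime('%Y', order_date) = '2023'"
# Bare column compared to a range: the index can narrow the rows directly.
good = "SELECT id FROM orders WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31'"

def plan(sql):
    """Return the planner's description of how it will run the query."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

print("bad: ", plan(bad))    # expected to be a SCAN over all rows
print("good:", plan(good))   # expected to be a SEARCH using idx_orders_date

# Both forms return the same rows; only the access path differs.
same_rows = sorted(conn.execute(bad).fetchall()) == sorted(conn.execute(good).fetchall())
print("same rows:", same_rows)
```

If `order_date` carries a time component, a half-open range such as `order_date >= '2023-01-01' AND order_date < '2024-01-01'` is the safer form, since `BETWEEN ... AND '2023-12-31'` would miss most of the last day.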
We fear AI will outthink us.
But the truth is, we're already over-fatiguing ourselves.
Rest isn't weakness, it's your upgrade cycle.
Rest, meditate, exercise, and upgrade yourself.
Your brain isn't a black-box model like a big LLM.
You can understand it.
Meditate. Journal. Listen. Debug your thoughts and get better.
I wish I had learned this earlier in data engineering:
"Data quality > data quantity."
Don't just move data. Validate it. Profile it. Monitor it at every stage of the data life cycle.
Otherwise, you're shipping junk faster.
Prompt engineering won't fix poor posture.
Your desk setup, movements, and fitness are part of your workflow.
Fix it manually.
Golden rule for NULLs:
NULLs don't behave like empty strings.
In SQL, col != '' silently drops rows where col is NULL, because the comparison evaluates to NULL rather than true.
In Spark, use .na.drop() or .na.fill() to stay explicit.
Treat NULLs as first-class citizens, or they'll sneak into your reports.
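A quick way to see this behaviour is Python's built-in SQLite. The `users` table and its three rows are hypothetical; the three-valued logic shown is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("ana", "ana@example.com"), ("bo", ""), ("cy", None)],
)

def names(where):
    """Return the names matching a WHERE clause, sorted for stable output."""
    query = f"SELECT name FROM users WHERE {where} ORDER BY name"
    return [row[0] for row in conn.execute(query)]

# NULL != '' evaluates to NULL (unknown), so 'cy' is dropped.
print(names("email != ''"))                   # ['ana']
# The "opposite" filter does not return 'cy' either.
print(names("email = ''"))                    # ['bo']
# Being explicit about NULLs makes the intent visible.
print(names("email IS NULL OR email = ''"))   # ['bo', 'cy']
print(names("COALESCE(email, '') != ''"))     # ['ana']
```

Note that 'cy' appears in neither of the first two results, which is exactly how rows go missing from a report without any error.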
Golden rule for SQL:
Scan less, query faster.
Use indexes for normal transactional DBs.
Use partitioning in Hive, Iceberg, or BigQuery to reduce data scanned.
One job went from 20 minutes to 2 with proper partition_column BETWEEN filters.
Systems can stay awake for hours without breaks.
You can't, so why sit in the same place for hours?
Movement gives mental clarity, fitness, and multiple benefits.
Stand. Stretch. Walk. Often.
Your body is your first and most important operating system.
Don't run it on low-quality fuel, zero sleep, and 100 tabs open.
A 1% daily improvement in you is transformational.
Compound interest isn't just for money and investments.
If most of what your product does can be done in Sheets/Excel or Notion pages, would you go for a separate tool just for the better UI?
This is a common reason products don't get enough paying users or growth. People don't see the bigger picture here.
AI and systems are designed to handle multiple tasks simultaneously, while your brain and body are made to do a single task with full focus.
Don't let multitasking slow down your progress. Do one task at a time in the best possible way, and you will be way faster than multitasking.