Data Validation and Alerting

Where are we?

Section Agenda

  • Ensure quality data is flowing through the pipeline
  • Get alerts via automation

How can I ensure the data in my pipeline is quality data?


🧰 pointblank for data validation

The pointblank data quality workflow

Data validation example

Agent Interrogation

Agent Validation Report

Pointblank data validation report
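
The agent workflow (create an agent, add validation steps, interrogate, report) could be sketched roughly as follows. The `orders` table, its columns, and the threshold values are made up for illustration and are not taken from the workshop project.

```r
library(pointblank)

# Hypothetical data: the third row has a missing ID and a negative amount
orders <- data.frame(
  order_id = c(1L, 2L, NA, 4L),
  amount   = c(19.99, 5.00, -3.50, 42.00)
)

# Assumed thresholds: flag a warning when 10% of rows fail a step,
# and a stop condition when 50% fail
al <- action_levels(warn_at = 0.1, stop_at = 0.5)

# 1. Create the agent and attach a validation plan
agent <- create_agent(
  tbl = orders,
  tbl_name = "orders",
  label = "Order data checks",
  actions = al
) %>%
  col_vals_not_null(columns = vars(order_id)) %>%
  col_vals_gt(columns = vars(amount), value = 0)

# 2. Interrogate: run every validation step against the table
agent <- interrogate(agent)

# 3. Report: a tibble with one row per validation step
#    (printing `agent` interactively shows the formatted HTML report)
report <- get_agent_report(agent, display_table = FALSE)

# FALSE here, because row 3 fails both checks
all_passed(agent)
```

Keeping the validation plan separate from the `interrogate()` call makes it easy to review the plan before any data is touched.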


Activity Time!

Activity

👉 Open the file materials/activities/activity-03_data_validation.qmd

Activity objective: Use pointblank to validate data, remove non-compliant records, and document the data.
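
As a hint for the "remove non-compliant records" part, one possible approach is pointblank's `get_sundered_data()`, which splits a table by row-level validation results. This is a minimal sketch with made-up data, not the activity solution.

```r
library(pointblank)

# Hypothetical data with one bad record (a negative amount)
orders <- data.frame(
  order_id = 1:4,
  amount   = c(19.99, 5.00, -3.50, 42.00)
)

agent <- create_agent(tbl = orders, tbl_name = "orders") %>%
  col_vals_gt(columns = vars(amount), value = 0) %>%
  interrogate()

# Rows that passed every row-level validation step
clean_orders <- get_sundered_data(agent, type = "pass")

# Rows that failed at least one step, kept for follow-up
rejected_orders <- get_sundered_data(agent, type = "fail")
```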

There’s much more to pointblank

  • Create a multiagent to summarize repeated validations and monitor data quality over time.

  • Create a dynamic data dictionary to fully describe the data.
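
Both ideas can be sketched in a few lines. The helper `check_orders()`, the two daily tables, and the column description below are hypothetical and only illustrate the shape of the calls.

```r
library(pointblank)

# Hypothetical helper: run the same validation plan on any snapshot
check_orders <- function(tbl) {
  create_agent(tbl = tbl, tbl_name = "orders", label = "Order checks") %>%
    col_vals_gt(columns = vars(amount), value = 0) %>%
    interrogate()
}

# Two made-up daily snapshots; the second contains a bad value
agent_mon <- check_orders(data.frame(amount = c(10, 20, 30)))
agent_tue <- check_orders(data.frame(amount = c(10, -5, 30)))

# Multiagent: combine repeated interrogations into one object;
# get_multiagent_report(multiagent) renders the combined report
multiagent <- create_multiagent(agent_mon, agent_tue)

# Informant: a data dictionary whose text can include values computed
# from the data itself (here, the largest amount)
informant <- create_informant(
  tbl = data.frame(amount = c(10, 20, 30)),
  tbl_name = "orders",
  label = "Order data dictionary"
) %>%
  info_tabular(description = "One row per customer order (hypothetical).") %>%
  info_snippet(
    snippet_name = "max_amount",
    fn = snip_highest(column = "amount")
  ) %>%
  info_columns(
    columns = vars(amount),
    info = "Order total; must be positive. Largest value: {max_amount}."
  ) %>%
  incorporate()
```

The `{max_amount}` placeholder is filled in when `incorporate()` runs, which is what makes the dictionary dynamic.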

Try the comprehensive pointblank test drive on Posit Cloud: https://posit.cloud/project/3411822

How will I know if my data has issues?

🚫

🥱 This email is non-informative

🔊 Creates noise in the inbox

🙈 Does not compel anyone to look at it

How will I know if my data has issues?

🧰 Customized, conditional emails

Project example

⚠️ There was a problem with the data validation

✅ All looks great, here’s a relevant summary

How blastula works with Connect

Key functions:

🧶 blastula::render_connect_email

✉️ blastula::attach_connect_email

blastula::suppress_scheduled_email

🧰 couple these with logic statements to send (or suppress) condition-based emails
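
One way the three functions might be combined with validation results is sketched below. The function name `send_validation_email()`, the `notify_on_success` argument, and the two `.Rmd` file names are hypothetical. The calls are expected to take effect only when the parent R Markdown document is rendered on Connect.

```r
library(blastula)
library(pointblank)

# `agent` is assumed to be an interrogated pointblank agent.
# Each .Rmd file (hypothetical names) would be an R Markdown document
# that uses the blastula::blastula_email output format.
send_validation_email <- function(agent, notify_on_success = TRUE) {
  if (!all_passed(agent)) {
    # At least one validation step had failing rows: send the alert email
    email <- render_connect_email(input = "email-validation-problem.Rmd")
    attach_connect_email(
      email,
      subject = "⚠️ There was a problem with the data validation"
    )
  } else if (notify_on_success) {
    # Everything passed: send a short summary instead
    email <- render_connect_email(input = "email-validation-summary.Rmd")
    attach_connect_email(email, subject = "✅ Data validation passed")
  } else {
    # Nothing worth reporting: skip the scheduled email for this run
    suppress_scheduled_email()
  }
}
```

Calling `send_validation_email(agent)` at the end of the scheduled report keeps the email logic in one place.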

Be aware…

⚠️ Common mistakes when creating emails

  • There is no interactive runtime for an email
  • JavaScript-dependent content will generally not render when emailed. This is not a limitation of blastula but a result of how email clients process HTML

🧰 Best practices for embedding objects in an email:

  • ggplot2 output can be included in the blastula email output directly
  • Create nicely formatted tables with the gt package. (Just remember, no interactivity!)
  • If you’d like to include a rendering of a widget (e.g., a dial or info box), use the webshot2 package to capture a screenshot of the widget and embed it as an image
  • If your email recipient wants more information or interactivity, direct them to a report or dashboard deployed to Connect
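
The practices above could look roughly like this when an email is composed directly in R. This is a sketch only: the `mtcars` data and the dashboard URL are placeholders, and in a Connect workflow the same objects would more typically be printed from chunks of the email's R Markdown document.

```r
library(blastula)
library(ggplot2)
library(gt)

# Static plot: add_ggplot() turns it into an embedded image (HTML string)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()
plot_html <- add_ggplot(plot_object = p, width = 5, height = 3)

# Static table: as_raw_html() returns HTML with inlined CSS
table_html <- as_raw_html(gt(head(mtcars[, c("mpg", "wt")], 5)))

# Placeholder URL for a report or dashboard deployed to Connect
email <- compose_email(
  body = md(c(
    "## Data summary",
    plot_html,
    table_html,
    "Need more detail? See the [full dashboard](https://connect.example.com/dashboard)."
  ))
)
```

A widget could be handled in a similar way: save it with `htmlwidgets::saveWidget()`, capture it with `webshot2::webshot()`, and embed the resulting file with `blastula::add_image()`.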

Other Alerting Approaches

Send alerts to a Slack channel or MS Teams, or via text message: https://rviews.rstudio.com/2020/06/18/how-to-have-r-notify-you/
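
As one illustration of the Slack option, a message can be posted to a Slack incoming webhook with httr. The function name `notify_slack()` and the `SLACK_WEBHOOK_URL` environment variable are assumptions made for this sketch; the linked article covers other approaches.

```r
library(httr)

# Post a plain-text message to a Slack incoming webhook.
# The webhook URL is read from a (hypothetical) environment variable
# so that it is not hard-coded in the report.
notify_slack <- function(message,
                         webhook_url = Sys.getenv("SLACK_WEBHOOK_URL")) {
  stopifnot(nzchar(webhook_url))
  response <- POST(
    webhook_url,
    body = list(text = message),
    encode = "json"
  )
  # Raise an R error if Slack did not accept the request
  stop_for_status(response)
  invisible(response)
}

# Example call (requires a valid webhook URL):
# notify_slack("⚠️ There was a problem with the data validation")
```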