
DQL Pricing ingest/retain/query

rgarzon1
Mentor

Hi

I was building a dashboard to dig into the data for API Management in Azure. After getting a decent dashboard for querying the APIs and their information, I got curious about how much it cost to build using DQL: it came to almost 3 USD according to scanned_bytes

for this dashboard with 13 queries:

apimanagement slo (1).png

 

I am fetching the same data for each tile, so it's querying the same data 13 times. (I would be glad if there were a way to query once and reuse that data in different tiles, but at the moment I don't know how.)

 

So I speculated with some numbers to get an idea of the usage cost of this dashboard:

  • rgarzon1_0-1704479211437.png
  • cost 
    • rgarzon1_1-1704479228917.png

So I speculated further... what about 10 days, opening that dashboard 5 times?

  • rgarzon1_2-1704479285998.png
  • cost
    • rgarzon1_3-1704479302484.png

What about 30 days, opening the dashboard 10 times per day?

  • rgarzon1_4-1704479353682.png
  • cost 
    • rgarzon1_5-1704479375557.png

 

Are those costs right? And what if this dashboard is used by 10 people? Is it roughly 10x the cost, around 8k?
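As a rough sanity check of that last figure, here is a small sketch using only the numbers mentioned in this post (the per-view cost, opens per day, and team size are assumptions taken from the post, not measured values):

```python
# Rough sanity check of the "10 people ~ 8k" estimate above.
# All inputs are assumptions taken from this post, not measured values.
cost_per_view_usd = 2.7   # "almost 3 USD" of scanned_bytes for one full view of the 13-tile dashboard
views_per_day = 10        # each person opens the dashboard 10 times per day
days = 30                 # 30-day period
users = 10                # dashboard shared by 10 people

total_usd = cost_per_view_usd * views_per_day * days * users
print(f"estimated query cost: {total_usd:,.0f} USD")  # ~8,100 USD with these inputs
```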

 

fuelled by coffee and curiosity.
6 REPLIES

rgarzon1
Mentor

Btw, I am just dropping the Excel file here in case someone wants to check whether a formula is wrong.

fuelled by coffee and curiosity.

Julius_Loman
DynaMight Legend

First, querying and scanning stored raw data is never an effective way to build a dashboard, neither from a cost perspective nor performance-wise. I briefly looked at your dashboard, and probably all of your tile data could easily be replaced by log metrics. I'm assuming you only have the log data and no metrics from OneAgent or other sources.

If I got your case right, in your example you have 2 GB daily log ingest with 10-day retention of the log data. That is roughly ~150 USD yearly by the list rate card. In your use case the issue is with the query: as you mentioned, you are querying 20 GB (10 days of 2 GB daily) of data, without any sampling, for each tile. That is 13 (tiles) * 20 GB for a single dashboard view. Viewing this 5x daily for a year in a team of 10 results in a lot of data being scanned.
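To make the scanned volume concrete, here is a small sketch using only the numbers from the reply above (2 GB/day ingest, 10-day retention, 13 tiles each reading the full window, 5 views per day, 10 users). Rate card prices are deliberately left out; only data volumes are computed:

```python
# Data volumes implied by the scenario above; all inputs come from the reply, not from measurements.
daily_ingest_gb = 2      # daily log ingest
retention_days = 10      # log retention
tiles = 13               # dashboard tiles, each running its own query over the full retention window
views_per_day = 5        # dashboard opens per person per day
users = 10               # team size
days_per_year = 365

ingested_per_year_gb = daily_ingest_gb * days_per_year            # 730 GB ingested per year
scanned_per_view_gb = tiles * daily_ingest_gb * retention_days    # 13 * 20 GB = 260 GB per dashboard view
scanned_per_year_gb = scanned_per_view_gb * views_per_day * users * days_per_year

print(ingested_per_year_gb)  # 730 GB/year ingested (the ~150 USD/year part)
print(scanned_per_view_gb)   # 260 GB scanned for a single dashboard view
print(scanned_per_year_gb)   # 4,745,000 GB/year scanned -- why query, not ingest, drives the cost here
```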

Certified Dynatrace Master | Alanata a.s., Slovakia, Dynatrace Master Partner

rgarzon1
Mentor

Hi Julius

I wanted to provide some additional context regarding our current metrics setup.

We have established metrics by ingesting logs and processing them through the Ingested Logs pipeline. This involves various steps, including processing, custom attribute assignments, and metrics extraction. However, we've encountered an issue with a volatile dimension: despite our attempts to delete this dimension, it persists over time and invalidates the associated metric. We're aware that there's an internal tool to block a dimension, but that is just another workaround.

Our objective is to transition to the new dashboards, aiming for a structure similar to the one used by multiple teams on Grafana. The goal is to replicate this dashboard setup without prior data processing. Unfortunately, the cost projection is something the client looks at closely.

fuelled by coffee and curiosity.

Can you please elaborate on your issue with dimensions and describe what exactly you have run into?

Without any processing (creating metrics), this is never an effective approach, neither performance-wise nor cost-wise. Reading 13x 20 GB of data for a dashboard (which is what happens internally in your approach) does not seem very effective, right? Even reading just 20 GB of data for a dashboard is quite a lot, unless it's an analysis in a notebook.

Certified Dynatrace Master | Alanata a.s., Slovakia, Dynatrace Master Partner

rgarzon1
Mentor

Hi Julius

When we processed the logs and created the metrics from them, we created a metric, e.g. metric.apimanagement.apiid. At the beginning we set 2 dimensions; over time we just added more and more, reaching almost 10 dimensions, without taking into account that there is a cardinality ("volatility") limit of 1M.

With just 1 of those 10 dimensions we reached 900k cardinality.

This can be checked with the selector below, where we saw 102%, throttling the metric:

dsfm:server.metrics.metric_dimensions_usage:filter(not(existskey("dt.tenant.uuid"))):sort(value(auto,descending)):limit(10)

When we noticed we had a problem, we deleted the dimension with the high cardinality, but even 3 days after the change (31/12/2023) we were still affected by its volatility.

We asked for the limit to be raised to 2M to fix it (which was done), and we were able to use that metric again after 6 days.

Now, after 12 days, we checked with support whether the dimension still existed and were able to validate that it no longer did. But it took 12 days after deleting it for it to stop appearing among the dimensions that count against the limit.

 

As for the internal dashboard: yes, doing it that way is not wise performance-wise, but it is something the teams do directly against the data (Azure), so they have more control over what is shown, and they thought they could have something similar here.

fuelled by coffee and curiosity.

Sure, cardinality has to be taken into account. However, briefly looking at your dashboard example, I don't think you should run into the issue if the metrics are designed correctly. Having 10 or more dimensions is not recommended anyway; you likely should have created more metrics instead.

Looking at data in your dashboard I'd create metrics such as:

  • apimanagement.volume
  • apimanagement.responsetime
  • apimanagement.statuscode (with additional dimension status code)
  • ...

with each having dimensions:

  • apiid
  • zonas
  • application name

Is the cardinality of these three dimensions, plus metric-specific dimensions such as status code, really that high?
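As a rough illustration of how to reason about this: worst-case metric cardinality is the product of the per-dimension cardinalities, and only combinations that actually occur count toward the limit. All numbers in this sketch are made-up placeholders, not values from this environment:

```python
# Worst-case metric cardinality is the product of the per-dimension cardinalities.
# All counts below are made-up placeholders -- substitute the real numbers for your environment.
api_ids = 100        # distinct apiid values
zonas = 5            # distinct zonas values
applications = 30    # distinct application names
status_codes = 15    # distinct status codes (only on apimanagement.statuscode)

base_series = api_ids * zonas * applications      # 15,000 series for volume / responsetime
statuscode_series = base_series * status_codes    # 225,000 worst case for the statuscode metric

print(base_series)        # 15,000 -- well below a 1M limit
print(statuscode_series)  # 225,000 -- only reached if every combination actually occurs
```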

Certified Dynatrace Master | Alanata a.s., Slovakia, Dynatrace Master Partner
