About data propagation time

Mar 13, 2018

Let me say it right away: if you do not like technical discussions, you can skip this article, as it is absolutely not essential to using Weather Station.
On the other hand, if you like to understand how the tools you use work, and what the possibilities and limitations of Weather Station are, you can read this article to the end. Or not 🙂

Many of you have asked me, “Why isn't Weather Station real-time?” Some have even proposed, as an improvement, to “allow a real-time display of the data measured by my weather station”… I owe you an explanation of how weather data collection works with Weather Station. But before describing how it works, and how the data ends up in your WordPress pages, I would like to take a few lines to define what real time actually is.

Real time?

In metrology, we speak of a real-time system when the collection, processing and display chain operates at a speed matched to how fast the measured values evolve. In meteorology, especially when noise or wind data is added, the sampling frequency is on the order of one hertz (one measurement per second). This is generally the case with consumer equipment (the new WeatherFlow weather stations, for example, sample wind at 1 Hz and average it at 1/3 Hz). Even if this raises real questions about the usefulness of such a frequency, and about the ability to exploit it, in the context of “classical” meteorology, the fact is that, from a hardware perspective, we can now measure large quantities of environmental values in a very short time.
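To make the sampling-versus-averaging idea concrete, here is a minimal Python sketch (my own illustration, not WeatherFlow firmware) that turns a 1 Hz stream of readings into 1/3 Hz averages; the random values and function names are placeholders.

```python
# Minimal sketch: 1 Hz wind samples reduced to one average every 3 samples,
# i.e. a 1/3 Hz output rate. Timing is omitted; only the averaging logic is shown.
import random

def one_hz_samples():
    """Simulated 1 Hz wind-speed readings, in m/s (placeholder values)."""
    while True:
        yield random.uniform(0.0, 15.0)

def averaged(samples, window=3):
    """Yield the mean of every `window` consecutive samples."""
    buffer = []
    for value in samples:
        buffer.append(value)
        if len(buffer) == window:
            yield sum(buffer) / window
            buffer.clear()

# Print the first five 3-second averages.
stream = averaged(one_hz_samples())
for _ in range(5):
    print(round(next(stream), 2))
```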
Let's be clear: I'm talking here about meteorology, not about a management and decision-making tool, as an anemometer can be when you are assisting the deck-landing of an F16 or racing a regatta… Beyond knowing our environment at a given moment, the main reason for measuring physical meteorological quantities is to feed forecasting models. Models that have – to my knowledge – no ability (and no “interest”) to handle data sampled at the frequencies mentioned above.
If you asked me today what sampling rate is useful and reasonable for everyday weather monitoring, I would certainly answer “between 1/300 Hz and 1/600 Hz” (a reading every 5 or 10 minutes). Nevertheless, I can understand that this opinion may need to be nuanced for other types of applications, and that some users of Weather Station may have needs other than mine.

That’s why Weather Station makes the most of what it can do, constrained only by technical and common-sense limitations. Explanations follow…

From measuring to WordPress…

Before appearing in your WordPress page or post, the data from a weather station goes through a fairly large number of intermediaries and processing steps.

For example, consider a WeatherFlow Smart Weather Station (the example is equally valid for other types of stations, such as the Netatmo personal weather station) and the wind speed and direction measurement (the measurement with the highest sampling frequency):
The SKY module takes a measurement every second and averages these readings every three seconds (on this subject, you will find a very interesting discussion on the use of the V3, V60 and V3.60 standards).
This average is then sent to the HUB (every 3 seconds, therefore). If the radio conditions are optimal, let's say that the HUB now holds a reading that is no more than one second old.
The HUB then sends this data (at a lower frequency, I think, but for the sake of the explanation let's assume there is no delay) to the WeatherFlow infrastructure (in marketing language, this is called the WeatherFlow cloud). After receiving this data, the WeatherFlow infrastructure makes it available to the APIs within 1 minute.
We already have 60+ε seconds of delay relative to the actual measurement (ε being the time taken by the SKY ⇢ HUB link).

But this explanation does not take into account an important element of the “chain of transmission”: the time needed to carry the data from the HUB to the WeatherFlow infrastructure. Indeed, this transmission goes through the Wi-Fi router, then the modem (cable, optical fiber, etc.), before crossing the Internet (where latency appears, linked to packet routing and perhaps also to a degraded connection). Let's estimate this at 1 extra second. Whatever the protocol used (MQTT, HTTP, etc.), the response and processing time of the request by the WeatherFlow servers cannot be less than 2 seconds.
Thus, before this data is available for consumption via the WeatherFlow APIs, at best 60+4ε seconds have elapsed (counting the 1-second transfer and the 2-second server processing as ε and 2ε respectively, since ε is about one second), and most of the time no more than ten minutes.
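For readers who prefer to see the arithmetic laid out, here is a small Python sketch of this delay budget; the stage names are mine, and ε ≈ 1 second is the same assumption as in the text.

```python
# Best-case delay budget from the physical measurement to API availability,
# using the estimates given above (epsilon assumed to be about 1 second).
epsilon = 1.0  # seconds, SKY -> HUB radio link

stages = {
    "SKY -> HUB radio link": epsilon,
    "HUB -> cloud transfer (router, modem, Internet)": 1.0,
    "request handling by the WeatherFlow servers": 2.0,
    "availability window of the cloud APIs": 60.0,
}

best_case = sum(stages.values())
print(f"Best case before the data can be consumed via the APIs: {best_case:.0f} s")
# With epsilon = 1 s, this prints 64 s, i.e. 60 + 4*epsilon.
```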

This example is, of course, the best-case scenario: for a station that publishes directly to Weather Underground, the minimum provisioning time on the APIs would be at least twice as long because of the limits on the sending frequency to this service (for WeatherFlow, I assumed that this limitation does not exist, which is highly unlikely).

Now let's look at the second phase of the transmission of this data…
Once this data is available via the APIs, Weather Station collects it. To do so, it uses the built-in WordPress task scheduler, which allows an action to be executed every X minutes. Here, in fast mode, Weather Station polls the APIs for the latest available data every 2 minutes (every 5 minutes in standard mode). The obtained data will therefore be at least 60+4ε seconds old, and at most 60+4ε + 120-ε seconds old. To this must be added the HTTP connection time (at least 1 second) and the time to process and store the data in the WordPress database (at least 1 second).

The data stored in the database will therefore be at least 60+6ε seconds old and at most 180+5ε seconds old.
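As a cross-check of these two bounds, here is a sketch that reuses the previous figure; the 2-minute polling interval and the two 1-second steps (HTTP connection, processing and storage) come from the text, and ε = 1 second remains an assumption.

```python
# Age of the data once it is stored in the WordPress database (fast mode).
epsilon = 1.0                  # seconds, assumed SKY -> HUB link time
api_best = 60 + 4 * epsilon    # freshest data exposed by the APIs (see above)
poll_interval = 120.0          # fast mode: Weather Station polls every 2 minutes
http_connection = 1.0          # at least 1 second
process_and_store = 1.0        # at least 1 second

db_min = api_best + http_connection + process_and_store
db_max = api_best + (poll_interval - epsilon) + http_connection + process_and_store

print(f"Age of the stored data: between {db_min:.0f} s and {db_max:.0f} s")
# With epsilon = 1 s: between 66 s and 185 s, i.e. 60+6*epsilon and 180+5*epsilon.
```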

Note that if we run the calculation with ε = 1 second, we get, at best, data that is 66 seconds old. And at this point, the data is still not displayed on your lovingly crafted page: it is only sitting in the database of your WordPress site.
To display it on the page, you can use text values, gauges, and so on. These elements are refreshed every two minutes (that is, they actually re-read the data from the database every two minutes). In the best case, this therefore adds ε more seconds relative to the actual measurement, and in the worst case 120-ε seconds.

So, if everything went well along this chain of transmission, the age of the measurement displayed in a Weather Station control is bounded by the following two values: 67 s < t < 304 s
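Extending the same sketch one last step recovers this bracket; here again, ε = 1 second is assumed, while the 2-minute refresh of the display controls comes from the text.

```python
# Age of the value actually shown on the page, starting from the stored-data bounds.
epsilon = 1.0                  # assumed SKY -> HUB link time, in seconds
db_min, db_max = 66.0, 185.0   # age of the stored data, computed in the previous sketch
refresh_interval = 120.0       # the display controls re-read the database every 2 minutes

displayed_min = db_min + epsilon                       # best case
displayed_max = db_max + (refresh_interval - epsilon)  # worst case

print(f"Age of the displayed value: {displayed_min:.0f} s < t < {displayed_max:.0f} s")
# With epsilon = 1 s: 67 s < t < 304 s, matching the bracket above.
```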

Conclusion

As you can see, even when everything goes well, the average delay is on the order of 3 minutes, which is consistent with what I consider reasonable, but is far from real time…
We could, of course, reduce this time slightly by increasing the collection frequency, but we would quickly hit the limits of what WordPress can do (a task scheduler with an internal resolution expressed in minutes), and this could seriously degrade the performance of the site running it.

Thus, as long as Weather Station uses the APIs of manufacturers or weather data aggregation services, it will stay well away from real time…

