What happened roughly every two hours between 12 Oct and 20 Oct 2022 with mapillary-sourced features?

I downloaded changesets-230508.osm.bz2 from https://ftp5.gwdg.de/pub/misc/openstreetmap/planet.openstreetmap.org/planet/2023/:

curl https://ftp5.gwdg.de/pub/misc/openstreetmap/planet.openstreetmap.org/planet/2023/changesets-230508.osm.bz2 -o changesets-230508.osm.bz2

I wanted to see all the changesets mentioning mapillary. I restricted my search to changesets overlapping with the approximate bounding box of Norway:

osmium changeset-filter --bbox -10,57,34,81 --progress -f opl changesets-230508.osm.bz2 > changesets_filtered.txt
grep -i mapillary changesets_filtered.txt > changesets_mapillary.txt

(Here I’m heavily relying on this post and osmium documentation.)
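For readers unfamiliar with the OPL output, here is a minimal sketch of how one changeset line could be split into fields. It assumes the layout described in the osmium OPL documentation: space-separated fields, the first character of each field naming the attribute, and `T` holding the comma-separated tags. The helper names `parse_opl_changeset` and `mentions_mapillary` and the sample line are made up for illustration. Unlike the plain grep, this check looks only at the tags, not at user names.

```python
def parse_opl_changeset(line):
    """Split one OPL changeset line into {field_letter: value}."""
    fields = {}
    for token in line.strip().split(" "):
        if token:
            fields[token[0]] = token[1:]
    return fields


def mentions_mapillary(fields):
    """True if the tag field (T) contains 'mapillary', case-insensitively."""
    return "mapillary" in fields.get("T", "").lower()


# Made-up sample line, not taken from the real dump.
sample = ("c1 k5 s2022-10-12T10:20:00Z e2022-10-12T10:21:00Z d0 i42 uexample "
          "x10.0 y59.0 X11.0 Y60.0 Tsource=Mapillary,comment=test")
fields = parse_opl_changeset(sample)
print(fields["s"], mentions_mapillary(fields))
```

Whether restricting the match to tags changes the picture for the real data is something I have not checked.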

To visualize how these changesets are distributed over time, I do:

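import datetime

import matplotlib.pyplot as plt
import pandas as pd

# OPL fields are space-separated; column 2 is the creation time, prefixed with 's'.
df = pd.read_csv("changesets_mapillary.txt", sep=' ', header=None)
df = df.assign(ts=df[2].apply(lambda field: field[1:]))  # drop the leading 's'
# Convert the ISO timestamps to Unix time in seconds.
df = df.assign(timestamps=df.ts.apply(lambda s: datetime.datetime.timestamp(pd.to_datetime(s))))

df.timestamps.hist(bins=100)
# Red reference line at 1.66e9 s, reused below as a cutoff.
plt.axvline(x=1.66*10**9, c='r', alpha=0.8)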

Giving me:

The horizontal axis is the Unix timestamp, the vertical axis is the number of changesets. As one can see, there is a big spike slightly after the red line.
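The timestamp conversion can also be done without a per-row lambda. The following is only a sketch of an alternative, not what I ran above; the function name `to_unix_seconds` and the two example values are made up for illustration.

```python
import pandas as pd


def to_unix_seconds(opl_times):
    """Convert OPL time fields such as 's2022-10-12T10:20:00Z' to Unix seconds."""
    parsed = pd.to_datetime(opl_times.str[1:], utc=True)  # strip the 's' prefix
    epoch = pd.Timestamp("1970-01-01", tz="UTC")
    return (parsed - epoch).dt.total_seconds()


# Illustrative input, not rows from the real file.
example = pd.Series(["s2022-10-12T10:20:00Z", "s2022-10-20T07:13:20Z"])
print(to_unix_seconds(example).tolist())
```

Subtracting an explicit UTC epoch avoids any dependence on the local time zone of the machine.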

To see how the timings of changesets are distributed over the course of a day, I do:

datetime_series = pd.to_datetime(df.timestamps, unit='s')
(datetime_series.dt.hour*60+datetime_series.dt.minute).hist(bins=int(24*60))
plt.xlim([0,24*60])

(Horizontal axis: minute within the day, from 0 to 1440, in bins of roughly one minute; vertical axis: number of changesets created in that minute.)

There is a strong 2-hour periodic signal present. To see if this is the same signal as the spike above, I consider only the changesets before the red line from the first image, and replot this second distribution:

datetime_series = pd.to_datetime(df.timestamps[df.timestamps<1.66*10**9], unit='s')
(datetime_series.dt.hour*60+datetime_series.dt.minute).hist(bins=int(24*60))
plt.xlim([0,24*60])

Clearly, the 2-hour signal disappeared. I replot the first plot, but now I zoom to the spike:

lower = 1.66557*10**9
upper = 1.66625*10**9

plt.figure(figsize=(10,5))
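# The original line was cut off after "(df.timestamps"; the "<upper" bound and bins=100 (as in the first plot) are assumed.
df.timestamps[(df.timestamps>lower) & (df.timestamps<upper)].hist(bins=100)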

As expected, a seemingly periodic signal. Let’s try to fold it with a 2-hour period:

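# The original line was cut off after "(df.timestamps"; the modulo by 2 hours and the bin count are an assumed reconstruction.
(df.timestamps[(df.timestamps>lower) & (df.timestamps<upper)] % (2*60*60)).hist(bins=100)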

Result:

I believe this demonstrates that there is a strong 2-hour periodic signal between lower and upper. What do lower and upper actually mean?

print(pd.Timestamp(lower,unit='s').strftime('%Y-%m-%d %H:%M:%S'))
print(pd.Timestamp(upper,unit='s').strftime('%Y-%m-%d %H:%M:%S'))

I get:

2022-10-12 10:20:00
2022-10-20 07:13:20
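The folding step used above can be illustrated on synthetic data. This is a minimal sketch with made-up numbers (bursts every 2 hours for 8 days, 5 minutes of jitter, uniform background), not a reproduction of the real changesets: taking each timestamp modulo the candidate period stacks all bursts into the same phase bin.

```python
import numpy as np

rng = np.random.default_rng(0)
period = 2 * 60 * 60  # candidate period in seconds (2 hours)
span = 8 * 24 * 3600  # length of the synthetic window in seconds

# Synthetic data: 20 events within 5 minutes after every 2-hour mark,
# plus 500 events spread uniformly over the whole window.
starts = np.arange(0, span, period)
bursts = (starts[:, None] + rng.uniform(0, 300, size=(len(starts), 20))).ravel()
background = rng.uniform(0, span, size=500)
timestamps = np.concatenate([bursts, background])

# Folding: keep only the phase of each timestamp within the period.
phase = timestamps % period
counts, edges = np.histogram(phase, bins=24, range=(0, period))  # 5-minute bins

print(counts)
```

With these assumptions almost all burst events land in the first 5-minute bin, while the background is spread evenly. Folding the same data at a period unrelated to 2 hours would smear the bursts across the bins instead.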

I don’t know what’s behind this observation. Hence, the question:

What happened between 12 Oct 2022 10am and 20 Oct 2022 07am, roughly every two hours, with mapillary-sourced features?


This discussion topic accompanies the post at https://community.openstreetmap.org/t/what-happened-roughly-every-two-hours-between-12-oct-and-20-oct-2022-with-mapillary-sourced-features/99185