Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LibCore: Single-shot timer is fired twice #3641

Open
stasoid opened this issue Feb 20, 2025 · 4 comments
Open

LibCore: Single-shot timer is fired twice #3641

stasoid opened this issue Feb 20, 2025 · 4 comments
Labels
bug Something isn't working

Comments

@stasoid
Copy link
Contributor

stasoid commented Feb 20, 2025

There is a delay between firing a timer event and executing a handler for that event. In EventLoopImplementationUnix the situation is possible when event for a single-shot timer is posted to ThreadEventQueue, then handler for another event calls pump again (indirectly through spin_until) and the same timer fires again. I noticed that because on Windows it results in infinite spinning in spin_until called from TraversableNavigable::check_if_unloading_is_canceled, see commit "LibCore: Make single-shot timer objects manually reset on Windows" in #3188 for more info.

After applying this patch the command Meta/ladybird.sh run headless-browser has this output on Ubuntu 22.04:

patch
diff --git a/Libraries/LibCore/EventLoopImplementationUnix.cpp b/Libraries/LibCore/EventLoopImplementationUnix.cpp
index f323d90bf9..a1acd5b1f0 100644
--- a/Libraries/LibCore/EventLoopImplementationUnix.cpp
+++ b/Libraries/LibCore/EventLoopImplementationUnix.cpp
@@ -16,6 +16,7 @@
 #include <LibCore/Socket.h>
 #include <LibCore/System.h>
 #include <LibCore/ThreadEventQueue.h>
+#include <LibCore/Timer.h>
 #include <pthread.h>
 #include <sys/select.h>
 #include <unistd.h>
@@ -200,6 +201,9 @@ public:
             }
         }
 
+        if (zero_delay && ++fire_count >= 2)
+            dbgln("--------  Zero-delay single-shot timer fired {} times", fire_count);
+
         // FIXME: While TimerShouldFireWhenNotVisible::Yes prevents the timer callback from being
         //        called, it doesn't allow event loop to sleep since it needs to constantly check if
         //        is_visible_for_timer_purposes changed. A better solution will be to unregister a
@@ -215,6 +219,9 @@ public:
     WeakPtr<EventReceiver> owner;
     pthread_t owner_thread { 0 };
     Atomic<bool> is_being_deleted { false };
+
+    bool zero_delay = false;
+    int fire_count = 0;
 };
 
 struct ThreadData {
@@ -634,6 +641,9 @@ intptr_t EventLoopManagerUnix::register_timer(EventReceiver& object, int millise
     timer->interval = AK::Duration::from_milliseconds(milliseconds);
     timer->reload(MonotonicTime::now_coarse());
     timer->should_reload = should_reload;
+    timer->zero_delay = milliseconds == 0;
+    if (milliseconds == 0)
+        VERIFY(static_cast<Timer&>(object).is_single_shot());
     timer->fire_when_not_visible = fire_when_not_visible;
     thread_data.timeouts.schedule_absolute(timer);
     return bit_cast<intptr_t>(timer);
diff --git a/Libraries/LibWebView/EventLoop/EventLoopImplementationQt.cpp b/Libraries/LibWebView/EventLoop/EventLoopImplementationQt.cpp
index 18a7fb6bd7..5de3bdc7bc 100644
--- a/Libraries/LibWebView/EventLoop/EventLoopImplementationQt.cpp
+++ b/Libraries/LibWebView/EventLoop/EventLoopImplementationQt.cpp
@@ -198,6 +198,7 @@ int EventLoopImplementationQt::exec()
 
 size_t EventLoopImplementationQt::pump(PumpMode mode)
 {
+    dbgln("EventLoopImplementationQt::pump");
     auto result = Core::ThreadEventQueue::current().process();
     auto qt_mode = mode == PumpMode::WaitForEvents ? QEventLoop::WaitForMoreEvents : QEventLoop::AllEvents;
     if (is_main_loop())
diff --git a/Services/WebContent/main.cpp b/Services/WebContent/main.cpp
index 6f10b26c72..ffae45d2ae 100644
--- a/Services/WebContent/main.cpp
+++ b/Services/WebContent/main.cpp
@@ -35,6 +35,7 @@
 #include <WebContent/PageClient.h>
 #include <WebContent/WebDriverConnection.h>
 
+#undef HAVE_QT
 #if defined(HAVE_QT)
 #    include <LibWebView/EventLoop/EventLoopImplementationQt.h>
 #    include <QCoreApplication>
Taking screenshot after 1 seconds
70786.281 WebContent(326424): --------  Zero-delay single-shot timer fired 2 times
70786.281 WebContent(326424): --------  Zero-delay single-shot timer fired 2 times
Saving screenshot to output.png
 326415 Lost connection to RequestServer
70787.301 WebContent(326424): Destroying Thread ""(140036729079360) while it is still running undetached!

I propose this fix: add a state "awaiting execution" to the timer, which is set to true when it fired, set to false when it is executed, and don't fire again if "awaiting execution" is true. Change Web::HTML::EventLoop::schedule to create a new timer each time it is called instead of using m_system_event_loop_timer.

@ADKaster ADKaster added the bug Something isn't working label Feb 20, 2025
@ADKaster
Copy link
Member

cc @DanShaders , is this an artifact of your Timer refactor from a while back?

Context from the fix in 3188
The spin_until([&]{return completed_tasks == total_tasks;}) in
TraversableNavigable::check_if_unloading_is_canceled spins forever.

Cause of the bug:

check_if_unloading_is_canceled is called deferred

check_if_unloading_is_canceled creates a task:

        queue_global_task(..., [&] {
            ...
            completed_tasks++;
        }));

This task is never executed.

queue_global_task calls TaskQueue::add

void TaskQueue::add(task)
{
    m_tasks.append(task);
    m_event_loop->schedule();
}

void HTML::EventLoop::schedule()
{
    if (!m_system_event_loop_timer)
        m_system_event_loop_timer = Timer::create_single_shot(
            0, // delay
            [&] { process(); });

    if (!m_system_event_loop_timer->is_active())
        m_system_event_loop_timer->restart();
}

EventLoop::process executes one task from task queue and calls
schedule again if there are more tasks.

So task processing relies on one single-shot zero-delay timer,
m_system_event_loop_timer.

Timers and other notification events are handled by Core::EventLoop
and Core::ThreadEventQueue, these are different from HTML::EventLoop
and HTML::TaskQueue mentioned above.

check_if_unloading_is_canceled is called using deferred_invoke
mechanism, different from m_system_event_loop_timer,
see Navigable::navigate and Core::EventLoop::deferred_invoke.

The core of the problem is that Core::EventLoop::pump is called again
(from spin_until) after timer fired but before its handler is executed.

In ThreadEventQueue::process events are moved into local variable before
executing. The first of those events is check_if_unloading_is_canceled.
One of the rest events is Web::HTML::EventLoop::process sheduled in
EventLoop::schedule using m_system_event_loop_timer.
When check_if_unloading_is_canceled calls queue_global_task its
m_system_event_loop_timer is still active because Timer::timer_event
was not yet called, so the timer is not restarted.
But Timer::timer_event (and hence EventLoop::process) will never execute
because check_if_unloading_is_canceled calls spin_until after
queue_global_task, and EventLoop::process is no longer in
event_queue.m_private->queued_events.

By making a single-shot timer manually-reset we are allowing it to fire
several times. So when spin_until is executed m_system_event_loop_timer
is fired again. Not an ideal solution, but this is the best I could
come up with. This commit makes the behavior match EventLoopImplUnix,
in which single-shot timer can also fire several times.

Adding event_queue.process(); at the start of pump like in EvtLoopImplQt
doesn't fix the problem.

Note: Timer::start calls EventReceiver::start_timer, which calls
EventLoop::register_timer with should_reload always set to true
(single-shot vs periodic are handled in Timer::timer_event instead),
so I use static_cast<Timer&>(object).is_single_shot() instead of
!should_reload.

@ADKaster
Copy link
Member

ADKaster commented Feb 20, 2025

Possibly-relevant original commits:

Switch from select to poll: SerenityOS/serenity@6836091 (SerenityOS/serenity#23046)

Timer refactor: SerenityOS/serenity@120d6b2 (SerenityOS/serenity#23149)

@ADKaster
Copy link
Member

The core of the problem is that Core::EventLoop::pump is called again
(from spin_until) after timer fired but before its handler is executed.

This ... feels unfortunate. Calling spin_until from an event callback. Outside of fixing this specific double-firing bug for single-shot timers, we should think about how to avoid doing that. Maybe it can't be avoided, maybe importing the coroutine stuff from Serenity would help serialize things, etc.

@DanShaders
Copy link
Contributor

DanShaders commented Feb 20, 2025

Oh yeah, thanks for linking all commits from that refactor

(My brain is currently unable to understand write-up you linked, try again later, i. e. I'll look into this tomorrow)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants