Mongoid: Iterating Over Large Collections with no_timeout

Aug 11th, 2018

: Mongod

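To walk through every merchant we would normally call Merchant.each, but that becomes a poor fit once the collection grows large, and this is where batch_size comes in.
In Mongoid, Model.all returns a Mongoid::Criteria instance, and calling #each on that Criteria instantiates a Mongo driver cursor to iterate over the records. This underlying driver cursor already fetches records in batches; by default the batch size is roughly 100 (the server returns 101 documents in the first batch).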

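batch_size = 100
# Sleep for one second at every batch_size-th record (index 0 included),
# so the loop pauses once per batch while the cursor keeps streaming.
Merchant.each_with_index do |merchant, index|
  if (index % batch_size).zero?
    sleep 1
  end
end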
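To make the batching behaviour concrete, here is a minimal pure-Ruby sketch of how a cursor can hand out records one at a time while fetching them in batches. `BatchedCursor` and `fetch_count` are hypothetical names invented for illustration; this is not the Mongo driver's cursor class, only a simulation of the idea.

```ruby
# Hypothetical stand-in for a driver cursor (not the real Mongo::Cursor).
class BatchedCursor
  include Enumerable

  # Number of simulated round trips made so far.
  attr_reader :fetch_count

  def initialize(records, batch_size: 100)
    @records = records
    @batch_size = batch_size
    @fetch_count = 0
  end

  # Yields records one by one, but pulls them from the source in batches.
  def each
    return enum_for(:each) unless block_given?

    offset = 0
    while offset < @records.size
      batch = @records[offset, @batch_size] # one simulated round trip
      @fetch_count += 1
      batch.each { |record| yield record }
      offset += batch.size
    end
  end
end

cursor = BatchedCursor.new((1..250).to_a, batch_size: 100)
seen = cursor.map { |n| n }
puts "records: #{seen.size}, round trips: #{cursor.fetch_count}"
```

The caller only ever sees a plain `each`, which is why Merchant.each looks like a single loop even though the records arrive in several fetches.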

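With Mongoid you can call Merchant.each directly, and it loads records in batches through a cursor automatically. The catch is that the cursor has a 10-minute timeout: the server closes a cursor that has been idle for that long, so a long-running iteration is at risk and can fail partway through with a no-cursor error: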

Mongo::Error::OperationFailure:
  Cursor not found, cursor id: 79727049273 (43)

Next, we can handle the no-cursor error like this:

Model.no_timeout.each
# OR use no_cursor_timeout

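`no_timeout`: by default every query cursor is subject to a timeout (the server's idle-cursor timeout, 10 minutes by default). Calling `no_timeout` on the criteria tells the server not to time the cursor out.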

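`no_cursor_timeout`: when the data set is large and processing takes a long time (a find may run for a while), a `Cursor not found` error can occur, so the cursor needs to be set to never expire. If `Cursor not found` still shows up, try raising `batch_size` so fewer round trips are needed; if each batch then takes too long to process, a smaller value keeps the cursor from sitting idle between fetches.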
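As a complement to disabling the timeout, a loop can also recover from a lost cursor by re-querying from the last record it processed. The sketch below is one possible pattern, not something the Mongoid API provides: `each_resumable`, `fetch_after`, and `CursorLost` are hypothetical names, and `CursorLost` merely stands in for the `Mongo::Error::OperationFailure` shown earlier.

```ruby
# Hypothetical error standing in for the driver's "Cursor not found" failure.
class CursorLost < StandardError; end

# Iterates in ascending id order and, if the cursor is lost, re-queries
# starting after the last id that was fully processed.
# `fetch_after` is a hypothetical callable: given an id (or nil for "from the
# start") it returns an enumerable of records with a greater id.
def each_resumable(fetch_after, max_retries: 3)
  last_id = nil
  retries = 0
  begin
    fetch_after.call(last_id).each do |record|
      yield record
      last_id = record[:id]
    end
  rescue CursorLost
    retries += 1
    raise if retries > max_retries
    retry # re-run the begin block; last_id is preserved
  end
end

# Simulated source that loses its cursor once, just before record 4.
rows = (1..6).map { |i| { id: i } }
failed = false
source = lambda do |after_id|
  Enumerator.new do |out|
    rows.each do |row|
      next if after_id && row[:id] <= after_id
      if row[:id] == 4 && !failed
        failed = true
        raise CursorLost, "Cursor not found"
      end
      out << row
    end
  end
end

processed = []
each_resumable(source) { |row| processed << row[:id] }
puts processed.inspect
```

With a real model, the callable would presumably wrap a criteria sorted by `_id` and filtered to ids greater than the last one seen; that mapping is an assumption and depends on the collection having a stable sort key.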

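Note: when we run a query in MongoDB, it returns a cursor (in effect an implementation of the Iterator pattern). The cursor has an explain() method that reports query plan information; the available verbosity modes are queryPlanner (default), executionStats, and allPlansExecution. To inspect the plan we usually look at winningPlan and rejectedPlans (for example, to see which index was used).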
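The explain fields mentioned in this post can be pulled out of the returned document with a few hash lookups. The following is a small sketch assuming the explain result is available as a Ruby Hash with string keys; `summarize_explain` and `plan_stages` are hypothetical helpers, and the sample document is hand-written for illustration rather than real server output.

```ruby
# Collects the stage names of a plan, outermost first, by following the
# nested "inputStage" entries.
def plan_stages(plan)
  stages = []
  while plan && plan["stage"]
    stages << plan["stage"]
    plan = plan["inputStage"]
  end
  stages
end

# Summarizes the parts of an explain document discussed in this post.
def summarize_explain(doc)
  planner = doc.fetch("queryPlanner", {})
  stats = doc.fetch("executionStats", {})

  {
    stages: plan_stages(planner["winningPlan"]),
    rejected_plans: planner.fetch("rejectedPlans", []).size,
    execution_time_ms: stats["executionTimeMillis"]
  }
end

# Hand-written sample shaped like explain output (values are made up).
sample = {
  "queryPlanner" => {
    "winningPlan" => {
      "stage" => "FETCH",
      "inputStage" => { "stage" => "IXSCAN", "indexName" => "name_1" }
    },
    "rejectedPlans" => []
  },
  "executionStats" => { "executionTimeMillis" => 12 }
}

p summarize_explain(sample)
```

Note that `executionStats` is only present when explain runs in executionStats or allPlansExecution mode, so `execution_time_ms` is nil for the default queryPlanner mode.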

For execution time, check `executionStats.executionTimeMillis`.

Greg Yang

Developer
