flush() and commit() in sqlalchemy

Administrator
2024-12-24

What is the difference between flush() and commit() in SQLAlchemy?

A Session object is basically an ongoing transaction of changes to a database (update, insert, delete). These operations aren't persisted to the database until they are committed. If your program aborts for some reason in the middle of a session transaction, any uncommitted changes in it are lost.

The session object registers transaction operations with session.add(), but does not communicate them to the database until session.flush() is called.

session.flush() communicates a series of operations to the database (insert, update, delete). The database maintains them as pending operations in a transaction. The changes aren't persisted permanently to disk, or visible to other transactions, until the database receives a COMMIT for the current transaction (which is what session.commit() does).
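The database-side behaviour described above (changes sent but not yet committed are visible inside their own transaction only) can be sketched without SQLAlchemy at all. The following is a minimal illustration using the standard-library sqlite3 module; the file name, table name, and the writer/reader variable names are assumptions made for the example.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")  # hypothetical database file

    writer = sqlite3.connect(path)
    writer.execute("CREATE TABLE foo (name TEXT)")
    writer.commit()

    reader = sqlite3.connect(path)  # a second, independent connection

    # The INSERT is sent to the database but no COMMIT has been issued yet.
    # This is roughly the state a session is in after flush().
    writer.execute("INSERT INTO foo VALUES ('A')")

    seen_by_writer = writer.execute("SELECT name FROM foo").fetchall()
    seen_by_reader_before = reader.execute("SELECT name FROM foo").fetchall()

    # COMMIT makes the row durable and visible to other connections.
    writer.commit()
    seen_by_reader_after = reader.execute("SELECT name FROM foo").fetchall()

    writer.close()
    reader.close()

print(seen_by_writer)         # [('A',)]
print(seen_by_reader_before)  # []
print(seen_by_reader_after)   # [('A',)]
```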

session.commit() commits (persists) those changes to the database.

flush() is always called as part of a call to commit().
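One way to observe this is to count flushes with a session event listener. The sketch below assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the Foo model and the flushes list are made up for the example.

```python
from sqlalchemy import Column, Integer, String, create_engine, event
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Foo(Base):  # hypothetical model used only for this illustration
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    name = Column(String)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

flushes = []


def record_flush(session, flush_context):
    # Called by SQLAlchemy after each flush that had work to do.
    flushes.append("flushed")


event.listen(session, "after_flush", record_flush)

session.add(Foo(name="A"))
flushes_before_commit = len(flushes)  # nothing has been flushed yet
session.commit()                      # no explicit flush() call anywhere
flushes_after_commit = len(flushes)   # commit() flushed on our behalf

print(flushes_before_commit, flushes_after_commit)  # 0 1
```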

When you use a Session object to query the database, the query will return results both from the database and from the flushed parts of the uncommitted transaction it holds. By default, Session objects autoflush their operations, but this can be disabled.

#---
# Assumes Session is a sessionmaker and Foo is a mapped class defined elsewhere.
s = Session()

s.add(Foo('A')) # The Foo('A') object has been added to the session.
                # It has not been committed to the database yet,
                #   but the query below autoflushes it, so it is returned.
print(1, s.query(Foo).all())
s.commit()

#---
s2 = Session()
s2.autoflush = False

s2.add(Foo('B'))
print(2, s2.query(Foo).all()) # The Foo('B') object is *not* returned
                              #   as part of this query because it hasn't
                              #   been flushed yet.
s2.flush()                    # Now, Foo('B') is in the same state as
                              #   Foo('A') was above.
print(3, s2.query(Foo).all())
s2.rollback()                 # Foo('B') has not been committed, and rolling
                              #   back the session's transaction removes it
                              #   from the session.
print(4, s2.query(Foo).all())

#---
Output:
1 [<Foo('A')>]
2 [<Foo('A')>]
3 [<Foo('A')>, <Foo('B')>]
4 [<Foo('A')>]

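Getting a new object's ID

To read the id of a new object you have to flush it first, for example with session.flush(). Autoflush is a session setting, not a method: it only triggers a flush right before a query runs, so simply adding an object with autoflush enabled does not populate its id.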

# Given a model with at least this id
class AModel(Base):
    __tablename__ = 'a_model'  # declarative models need a table name
    id = Column(Integer, primary_key=True)  # autoincrement by default on integer primary key

session.autoflush = True

a = AModel()
session.add(a)
a.id  # None
session.flush()
a.id  # autoincremented integer

Why call flush()?

flush

So, in summary, one use for flush() appears to be providing ordering guarantees: the pending operations are sent to the database at a point you choose, so later statements in the same transaction see them. At the same time you keep an "undo" option (rollback) that commit() does not provide.

  1. Ordered submission of operations

  2. Keeps the undo ability that commit lacks
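The "flush now, still able to undo" pattern can be sketched as follows. This assumes SQLAlchemy 1.4 or later with an in-memory SQLite database; the Foo model and the row names are invented for the example.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Foo(Base):  # hypothetical model used only for this illustration
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    name = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add(Foo(name="kept"))
session.commit()

b = Foo(name="discarded")
session.add(b)
session.flush()          # the INSERT is sent now, inside the open transaction
id_after_flush = b.id    # the primary key is available after the flush

names_in_transaction = [f.name for f in session.query(Foo).order_by(Foo.id)]

session.rollback()       # undo everything that was flushed but not committed
names_after_rollback = [f.name for f in session.query(Foo).order_by(Foo.id)]

print(names_in_transaction)  # ['kept', 'discarded']
print(names_after_rollback)  # ['kept']
```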

autoflush

When True, all query operations will issue a Session.flush() call to this Session before proceeding. This is a convenience feature so that Session.flush() need not be called repeatedly in order for database queries to retrieve results. It’s typical that autoflush is used in conjunction with autocommit=False. In this scenario, explicit calls to Session.flush() are rarely needed; you usually only need to call Session.commit() (which flushes) to finalize changes.
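A small sketch of what autoflush does in practice, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database (the Foo model is invented for the example). It uses the session's no_autoflush context manager to suppress the automatic flush for one query, then lets a normal query trigger it.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Foo(Base):  # hypothetical model used only for this illustration
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    name = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()  # autoflush is enabled by default

a = Foo(name="A")
session.add(a)
id_after_add = a.id  # None: add() alone does not flush

# Inside this block, queries do not flush pending objects first.
with session.no_autoflush:
    count_without_autoflush = session.query(Foo).count()
id_after_unflushed_query = a.id  # still None

# A normal query flushes pending changes before it runs.
count_with_autoflush = session.query(Foo).count()
id_after_autoflush = a.id  # populated by the autoflush

print(count_without_autoflush, count_with_autoflush)  # 0 1
```

The same run also shows that an autoflush triggered by a query fills in the primary key, just as an explicit flush() would.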

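Memory usage

Committing should reduce memory usage, although there is probably a trade-off between memory and performance here. In other words, you probably don't want to commit database changes one at a time (for performance reasons), but waiting too long increases memory usage.

  1. Memory optimization: once a weakly referenced object is no longer referenced elsewhere, its memory can be reclaimed

  2. Circular references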
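The trade-off between commit frequency and memory described above can be sketched as committing in batches. This is only an illustration, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database; the Foo model, the row count, and BATCH_SIZE are arbitrary values chosen for the example, not recommendations.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Foo(Base):  # hypothetical model used only for this illustration
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    name = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

BATCH_SIZE = 100  # assumed value; tune it for your own workload

for i in range(250):
    session.add(Foo(name=f"row-{i}"))
    # Commit every BATCH_SIZE objects so the session does not keep
    # an ever-growing set of pending objects in memory.
    if (i + 1) % BATCH_SIZE == 0:
        session.commit()

session.commit()  # commit the final, partial batch

total_rows = session.query(Foo).count()
pending_left = len(session.new)

print(total_rows, pending_left)  # 250 0
```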

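Here is why the SQLAlchemy session uses weak references to store ORM objects.

The main reasons are:

1. Memory management
When an ORM object is no longer referenced by any other code, it should be garbage collected promptly. If the session held strong references to everything, no object could be collected while the session exists, even objects that are no longer in use. Weak references avoid this. This applies to persistent objects without pending changes: the session keeps a strong reference to objects that are pending (added but not yet flushed) or modified, so that their changes are not lost.

For example:
```python
session = Session()
user = User(name="test")
session.add(user)
session.commit()  # user is now persistent, with no pending changes

# If no other code uses the user object any more
del user

# The session only holds a weak reference now, so the object can be garbage collected.
# Before the commit (while still pending) the session would have kept it alive.
```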
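The difference between pending and persistent objects can be checked with a weak reference of our own. This is a sketch assuming SQLAlchemy 1.4 or later, CPython, and an in-memory SQLite database; the User model is invented for the example.

```python
import gc
import weakref

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class User(Base):  # hypothetical model used only for this illustration
    __tablename__ = "user_account"
    id = Column(Integer, primary_key=True)
    name = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

user = User(name="test")
session.add(user)
probe = weakref.ref(user)  # lets us check whether the object is still alive

# Drop our only reference while the object is still pending.
del user
gc.collect()
alive_while_pending = probe() is not None  # the session keeps it alive

# After the commit the object is persistent with no pending changes,
# so the session only references it weakly.
session.commit()
gc.collect()
alive_after_commit = probe() is not None

print(alive_while_pending, alive_after_commit)
```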

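2. Preventing circular references
The session and ORM objects may refer to each other:
- the session needs to track ORM objects
- an ORM object needs to know which session it belongs to

If both directions were strong references, they would form a reference cycle. Python's cyclic garbage collector can reclaim cycles, but only when it runs, so memory is released later than it would be by reference counting alone. A weak reference in one direction avoids the cycle.
```python
# Conceptual illustration only: this is not how SQLAlchemy models are written,
# and mapped classes do not store the session in an attribute like this.
class User:
    def __init__(self, session):
        self.session = session  # User references the Session
        # The Session also references User (through a weak reference)
        session.add(self)
```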

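3. Cache consistency
The session acts as a first-level cache (the identity map), so it needs to follow the lifecycle of ORM objects. With weak references, an entry disappears from the cache when its object is garbage collected, so the cache never keeps objects alive on its own.

4. Transaction management
In a long-running transaction, a session that strongly referenced every ORM object could hold on to a large amount of memory. Weak references allow objects that are no longer used, and have no pending changes, to be released in time.
Powered by AI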
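# Weak references
import weakref

class User:
    def __init__(self, name):
        self.name = name

# Create an ordinary object
user = User("Alice")

# Create a weak reference to it
weak_ref = weakref.ref(user)

# Access the object through the weak reference
print(weak_ref().name)  # prints: Alice

# When the original object is deleted
del user

# the weak reference becomes dead (immediately in CPython)
print(weak_ref())  # prints: None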
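The same idea extends from a single weak reference to a whole cache. The following sketch uses the standard-library weakref.WeakValueDictionary as a stand-in for a weak-referencing registry; it is similar in spirit to an identity map but is not SQLAlchemy's implementation, and the Record class and key format are made up for the example.

```python
import gc
import weakref


class Record:  # hypothetical stand-in for an ORM object
    def __init__(self, key):
        self.key = key


# Values are held weakly: an entry vanishes once nothing else
# references its value.
registry = weakref.WeakValueDictionary()

record = Record("user:1")
registry[record.key] = record
tracked_while_referenced = "user:1" in registry

# Drop the only strong reference; the registry does not keep it alive.
del record
gc.collect()
tracked_after_delete = "user:1" in registry

print(tracked_while_referenced, tracked_after_delete)  # True False
```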

Warning

Note that after Session.commit() is called, either explicitly or when using a context manager, all objects associated with the Session are expired, meaning their contents are erased to be re-loaded within the next transaction. If these objects are instead detached, they will be non-functional until re-associated with a new Session, unless the Session.expire_on_commit parameter is used to disable this behavior. See the section Committing for more detail.

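Expired means that the loaded attribute values of every object are cleared. While an object is still attached to its session, the next attribute access reloads it automatically. If you need an object's values after commit(), for example because it is about to be detached, reload it explicitly with session.refresh(obj) or create the session with expire_on_commit=False.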
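The expiration behaviour can be observed by looking at an object's __dict__, where loaded column values are stored. This is a sketch assuming SQLAlchemy 1.4 or later and an in-memory SQLite database; the Foo model and the variable names are invented for the example.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Foo(Base):  # hypothetical model used only for this illustration
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    name = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

# Default behaviour: objects are expired on commit.
session = sessionmaker(bind=engine)()
foo = Foo(name="A")
session.add(foo)
session.commit()
loaded_after_commit = "name" in foo.__dict__   # value was cleared

session.refresh(foo)                           # explicit reload
loaded_after_refresh = "name" in foo.__dict__
name_after_refresh = foo.name
session.close()

# With expire_on_commit=False the loaded values survive the commit.
keeping = sessionmaker(bind=engine, expire_on_commit=False)()
bar = Foo(name="B")
keeping.add(bar)
keeping.commit()
loaded_without_expiry = "name" in bar.__dict__
keeping.close()

print(loaded_after_commit, loaded_after_refresh, loaded_without_expiry)
# False True True
```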
