Scan Class — Scan

Scan

class aerospike.Scan

The Scan object is used to return all the records in a specified set (which can be omitted or None). A Scan with a None set returns all the records in the namespace.

The scan is invoked using either foreach() or results(). The bins returned can be filtered using select().

See also

Scans and Managing Scans.

select(bin1[, bin2[, bin3..]])

Set a filter on the record bins resulting from results() or foreach(). If a selected bin does not exist in a record, it will not appear in the bins portion of that record tuple.
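The effect of select() on the bins portion of each record can be pictured with a plain-Python sketch. The helper project_bins is hypothetical and not part of the aerospike module; it only mimics the behaviour described above on an ordinary dict and does not contact a server.

```python
def project_bins(bins, *selected):
    """Mimic select(): keep only the selected bins that exist in the record.

    Hypothetical helper for illustration; not an aerospike API.
    """
    return {name: bins[name] for name in selected if name in bins}


# Bins of two records, the same data as in the results() example below.
record1_bins = {'id': 1, 'a': 1}
record2_bins = {'id': 2, 'b': 2}

# 'zzz' exists in neither record, so it is silently left out.
print(project_bins(record1_bins, 'id', 'a', 'zzz'))
print(project_bins(record2_bins, 'id', 'a', 'zzz'))
```

The first record keeps both 'id' and 'a'; the second keeps only 'id', because it has no bin named 'a'.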

results([policy]) -> list of (key, meta, bins)

Buffer the records resulting from the scan, and return them as a list of records.

Parameters: policy (dict) – optional Scan Policies.
Returns: a list of Record Tuple.
import aerospike
import pprint

pp = pprint.PrettyPrinter(indent=2)
config = { 'hosts': [ ('127.0.0.1',3000)]}
client = aerospike.client(config).connect()

client.put(('test','test','key1'), {'id':1,'a':1},
    policy={'key':aerospike.POLICY_KEY_SEND})
client.put(('test','test','key2'), {'id':2,'b':2},
    policy={'key':aerospike.POLICY_KEY_SEND})

scan = client.scan('test', 'test')
scan.select('id','a','zzz')
res = scan.results()
pp.pprint(res)
client.close()

Note

We expect to see:

[ ( ( 'test',
      'test',
      u'key2',
      bytearray(b'\xb2\x18\n\xd4\xce\xd8\xba:\x96s\xf5\x9ba\xf1j\xa7t\xeem\x01')),
    { 'gen': 52, 'ttl': 2592000},
    { 'id': 2}),
  ( ( 'test',
      'test',
      u'key1',
      bytearray(b'\x1cJ\xce\xa7\xd4Vj\xef+\xdf@W\xa5\xd8o\x8d:\xc9\xf4\xde')),
    { 'gen': 52, 'ttl': 2592000},
    { 'a': 1, 'id': 1})]

foreach(callback[, policy[, options]])

Invoke the callback function for each of the records streaming back from the scan.

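Parameters:
  • callback (callable) – the function to invoke for each record.
  • policy (dict) – optional Scan Policies.
  • options (dict) – optional Scan Options.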

Note

A Record Tuple is passed as the argument to the callback function.

import aerospike
import pprint

pp = pprint.PrettyPrinter(indent=2)
config = { 'hosts': [ ('127.0.0.1',3000)]}
client = aerospike.client(config).connect()

client.put(('test','test','key1'), {'id':1,'a':1},
    policy={'key':aerospike.POLICY_KEY_SEND})
client.put(('test','test','key2'), {'id':2,'b':2},
    policy={'key':aerospike.POLICY_KEY_SEND})

def show_key(record):
    # Unpack the Record Tuple in the body; tuple parameters only work in Python 2
    key, meta, bins = record
    print(key)

scan = client.scan('test', 'test')
scan_opts = {
  'concurrent': True,
  'nobins': True,
  'priority': aerospike.SCAN_PRIORITY_MEDIUM
}
scan.foreach(show_key, options=scan_opts)
client.close()

Note

We expect to see:

('test', 'test', u'key2', bytearray(b'\xb2\x18\n\xd4\xce\xd8\xba:\x96s\xf5\x9ba\xf1j\xa7t\xeem\x01'))
('test', 'test', u'key1', bytearray(b'\x1cJ\xce\xa7\xd4Vj\xef+\xdf@W\xa5\xd8o\x8d:\xc9\xf4\xde'))

Note

To stop the stream return False from the callback function.

from __future__ import print_function
import aerospike

config = { 'hosts': [ ('127.0.0.1',3000)]}
client = aerospike.client(config).connect()

def limit(lim, result):
    c = [0] # integers are immutable so a list (mutable) is used for the counter
    def key_add(record):
        key, metadata, bins = record  # unpack the Record Tuple (works in Python 2 and 3)
        if c[0] < lim:
            result.append(key)
            c[0] = c[0] + 1
        else:
            return False
    return key_add

scan = client.scan('test','user')
keys = []
scan.foreach(limit(100, keys))
print(len(keys)) # this will be 100 if the number of matching records > 100
client.close()

Scan Policies

policy

A dict of optional scan policies which are applicable to Scan.results() and Scan.foreach(). See Policies.

  • timeout – maximum time in milliseconds to wait for the operation to complete. Default 0 means do not time out.
  • fail_on_cluster_change (bool) – whether to fail the scan if a change occurs on the cluster. Default True.
  • socket_timeout – maximum time in milliseconds for the server-side socket timeout. 0 means there is no socket timeout. Default 10000. Added in version 2.0.11.
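As a minimal sketch of how these policies might be passed, the hypothetical helper below builds a policy dict and hands it to results(). It assumes client is an already connected aerospike client; the helper name, its parameters, and the 5000 ms timeout are illustrative choices, not part of the library.

```python
def scan_with_policy(client, namespace, set_name, timeout_ms=5000):
    """Return all records of a set as a list, using an explicit scan policy.

    Hypothetical helper: `client` is assumed to be a connected aerospike client.
    """
    policy = {
        'timeout': timeout_ms,           # stop waiting after this many ms (0 = never)
        'fail_on_cluster_change': True,  # documented default, spelled out
        'socket_timeout': 10000,         # documented default; needs version 2.0.11+
    }
    scan = client.scan(namespace, set_name)
    return scan.results(policy)
```

Calling scan_with_policy(client, 'test', 'test') would return the same list of Record Tuple as results() without arguments, but the call would no longer wait indefinitely.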

Scan Options

options

A dict of optional scan options which are applicable to Scan.foreach().

  • priority – see Scan Constants for values. Default aerospike.SCAN_PRIORITY_AUTO.
  • nobins (bool) – whether to return the bins portion of the Record Tuple. Default False.
  • concurrent (bool) – whether to run the scan concurrently on all nodes of the cluster. Default False.
  • include_ldt (bool) – whether to include LDT bins with the scan. Default False.
  • percent (int) – percentage of records to return from the scan. Default 100.

New in version 1.0.39.
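The options above could be combined to sample the keys of a set without transferring bin data. The following is a sketch under assumptions: sample_keys is a hypothetical helper, client is assumed to be a connected aerospike client, and the 10 percent default is only an illustrative value.

```python
def sample_keys(client, namespace, set_name, percent=10):
    """Collect the keys from a scan limited to `percent` percent of the records.

    Hypothetical helper: `client` is assumed to be a connected aerospike client.
    """
    keys = []

    def collect(record):
        key, meta, bins = record  # a Record Tuple is passed to the callback
        keys.append(key)

    options = {
        'nobins': True,      # only the keys are needed here
        'concurrent': True,  # run the scan on all nodes at the same time
        'percent': percent,  # return only part of the records
    }
    client.scan(namespace, set_name).foreach(collect, options=options)
    return keys
```

The options dict is passed by keyword, as in the foreach() example earlier, so the policy argument can be left out.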