A Quick Guide to Apache Geode

1. Overview

Apache Geode is a distributed in-memory data grid that supports caching and data computation.

In this tutorial, we'll cover the key concepts of Geode and run through some code samples using its Java client.

2. Setup

First, we need to download and install Apache Geode and set up the gfsh environment. To do this, we can follow the instructions in Geode's official guide.

And second, this tutorial will create some filesystem artifacts, so we can isolate them by creating a temporary directory and launching things from there.
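As a minimal sketch, that isolation step could look like the following; the directory name geode-tutorial is just an example:

```shell
# create a throwaway working directory so the locator/server
# artifacts stay isolated (the directory name is arbitrary)
mkdir -p geode-tutorial
cd geode-tutorial
# from here we'd launch gfsh, so the cluster's working
# directories land under geode-tutorial
```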

2.1. Installation and Configuration

From our temporary directory, we need to start a Locator instance:

gfsh> start locator --name=locator --bind-address=localhost

The Locator is responsible for coordination between the different members of a Geode Cluster, which we can further manage over JMX.
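For instance, once the locator is up, another gfsh session can join and manage the cluster through it. A sketch, assuming the default locator port 10334:

```shell
gfsh> connect --locator=localhost[10334]
```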

Next, let's start a Server instance to host one or more data Regions:

gfsh> start server --name=server1 --server-port=0

We set the --server-port option to 0 so that Geode will pick any available port; if we left it out, the server would use the default port 40404. A Server is a configurable member of the Cluster that runs as a long-lived process and is responsible for managing data Regions.

And finally, we need a Region:

gfsh> create region --name=baeldung --type=REPLICATE

The Region is, ultimately, where we will store our data.

2.2. Verification

Let's make sure everything is working before we go any further.

First, let's check that we have our Server and our Locator:

gfsh> list members
 Name   | Id
------- | ----------------------------------------------------------
server1 | 192.168.0.105(server1:6119):1024
locator | 127.0.0.1(locator:5996:locator):1024 [Coordinator]

And next, that we have our Region:

gfsh> describe region --name=baeldung
..........................................................
Name            : baeldung
Data Policy     : replicate
Hosting Members : server1

Non-Default Attributes Shared By Hosting Members

 Type  |    Name     | Value
------ | ----------- | ---------------
Region | data-policy | REPLICATE
       | size        | 0
       | scope       | distributed-ack

Also, we should have some directories on the filesystem under our temporary directory, called "locator" and "server1".

With these outputs, we know that we're ready to move on.

3. Maven Dependency

Now that we have Geode running, let's start looking at the client code.

To work with Geode in our Java code, we need to add the Apache Geode Java client library to our pom:

<dependency>
    <groupId>org.apache.geode</groupId>
    <artifactId>geode-core</artifactId>
    <version>1.6.0</version>
</dependency>

Let's begin by storing and retrieving some data in our regions.

4. Simple Storage and Retrieval

Let's demonstrate how to store single values, batches of values, as well as custom objects.

To start storing data in our "baeldung" region, let's connect to it using the locator:

@Before
public void connect() {
    this.cache = new ClientCacheFactory()
      .addPoolLocator("localhost", 10334)
      .create();
    this.region = cache
      .createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
      .create("baeldung");
}

4.1. Saving Single Values

Now, we can simply store and retrieve data in our region:

@Test
public void whenSendMessageToRegion_thenMessageSavedSuccessfully() {
    this.region.put("A", "Hello");
    this.region.put("B", "Baeldung");

    assertEquals("Hello", region.get("A"));
    assertEquals("Baeldung", region.get("B"));
}

4.2. Saving Multiple Values at Once

We can also store multiple values at once, say when we're trying to reduce network latency:

@Test
public void whenPutMultipleValuesAtOnce_thenValuesSavedSuccessfully() {
    Supplier<Stream<String>> keys = () -> Stream.of("A", "B", "C", "D", "E");
    Map<String, String> values = keys.get()
      .collect(Collectors.toMap(Function.identity(), String::toLowerCase));

    this.region.putAll(values);

    keys.get()
      .forEach(k -> assertEquals(k.toLowerCase(), this.region.get(k)));
}

4.3. Saving Custom Objects

Strings are useful, but sooner or later we'll need to store custom objects.

Let's imagine we have a customer record that we want to store using the following key type:

public class CustomerKey implements Serializable {
    private long id;
    private String country;

    // getters and setters
    // equals and hashcode
}

And the following value type:

public class Customer implements Serializable {
    private CustomerKey key;
    private String firstName;
    private String lastName;
    private Integer age;

    // getters and setters
}

There are a couple of extra steps to be able to store these:

First, they should implement Serializable. While this isn't a strict requirement, making them Serializable lets Geode store them more robustly.

Second, they need to be on our application's classpath as well as the classpath of our Geode Server.

To get them to the server's classpath, let's package them up, say using mvn clean package.

And then we can reference the resulting jar in a new start server command:

gfsh> stop server --name=server1
gfsh> start server --name=server1 --classpath=../lib/apache-geode-1.0-SNAPSHOT.jar --server-port=0

Again, we have to run these commands from the temporary directory.

Finally, let's create a new Region named “baeldung-customers” on the Server using the same command we used for creating the “baeldung” region:

gfsh> create region --name=baeldung-customers --type=REPLICATE

In the code, we'll reach out to the locator as before, specifying the custom type:

@Before
public void connect() {
    // ... connect through the locator
    this.customerRegion = this.cache
      .<CustomerKey, Customer> createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
      .create("baeldung-customers");
}

And, then, we can store our customer as before:

@Test
public void whenPutCustomKey_thenValuesSavedSuccessfully() {
    CustomerKey key = new CustomerKey(123);
    Customer customer = new Customer(key, "William", "Russell", 35);

    this.customerRegion.put(key, customer);

    Customer storedCustomer = this.customerRegion.get(key);
    assertEquals("William", storedCustomer.getFirstName());
    assertEquals("Russell", storedCustomer.getLastName());
}

5. Region Types

For most environments, we'll have more than one copy or more than one partition of our region, depending on our read and write throughput requirements.

So far, we've used in-memory replicated regions. Let's take a closer look.

5.1. Replicated Region

As the name suggests, a Replicated Region maintains copies of its data on more than one Server. Let's test this.

From the gfsh console in the working directory, let's add one more Server named server2 to the cluster:

gfsh> start server --name=server2 --classpath=../lib/apache-geode-1.0-SNAPSHOT.jar --server-port=0

Remember that when we made "baeldung", we used --type=REPLICATE. Because of this, Geode will automatically replicate our data to the new server.

Let's verify this by stopping server1:

gfsh> stop server --name=server1

And, then let's execute a quick query on the “baeldung” region.

If the data was replicated successfully, we'll get results back:

gfsh> query --query='select e.key from /baeldung.entries e'
Result : true
Limit  : 100
Rows   : 5

Result
------
C
B
A
E
D

So, it looks like the replication succeeded!

Adding a replica to our region improves data availability. And, because more than one server can respond to queries, we'll get higher read throughput as well.

But, what if they both crash? Since these are in-memory regions, the data will be lost. For this, we can instead use --type=REPLICATE_PERSISTENT, which also stores the data on disk while replicating.
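As a sketch, a persistent replicated region is created the same way as the in-memory one, just with the persistent type; the region name baeldung-durable here is only illustrative:

```shell
gfsh> create region --name=baeldung-durable --type=REPLICATE_PERSISTENT
```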

5.2. Partitioned Region

With larger datasets, we can better scale the system by configuring Geode to split a region up into separate partitions, or buckets.

Let's create one partitioned Region named “baeldung-partitioned”:

gfsh> create region --name=baeldung-partitioned --type=PARTITION

Add some data:

gfsh> put --region=baeldung-partitioned --key="1" --value="one"
gfsh> put --region=baeldung-partitioned --key="2" --value="two"
gfsh> put --region=baeldung-partitioned --key="3" --value="three"

And quickly verify:

gfsh> query --query='select e.key, e.value from /baeldung-partitioned.entries e'
Result : true
Limit  : 100
Rows   : 3

key | value
--- | -----
2   | two
1   | one
3   | three

Then, to validate that the data got partitioned, let's stop server1 again and re-query:

gfsh> stop server --name=server1
gfsh> query --query='select e.key, e.value from /baeldung-partitioned.entries e'
Result : true
Limit  : 100
Rows   : 1

key | value
--- | -----
2   | two

We only got some of the entries back this time because the remaining server holds only one partition of the data; when server1 dropped, the entries in its partition were lost.

But what if we need both partitioning and redundancy? Geode also supports a number of other types. The following three are handy:

  • PARTITION_REDUNDANT partitions and replicates our data across different members of the cluster
  • PARTITION_PERSISTENT partitions the data like PARTITION, but to disk, and
  • PARTITION_REDUNDANT_PERSISTENT gives us all three behaviors.
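For example, a redundant partitioned region can be created the same way as our other regions; a sketch with an illustrative name:

```shell
gfsh> create region --name=baeldung-redundant --type=PARTITION_REDUNDANT
```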

6. Object Query Language

Geode also supports Object Query Language, or OQL, which can be more powerful than a simple key lookup. It's a bit like SQL.

For this example, let's use the "baeldung-customers" region we built earlier.

If we add a couple more customers:

Map<CustomerKey, Customer> data = new HashMap<>();
data.put(new CustomerKey(1), new Customer("Gheorge", "Manuc", 36));
data.put(new CustomerKey(2), new Customer("Allan", "McDowell", 43));
this.customerRegion.putAll(data);

Then we can use QueryService to find customers whose first name is “Allan”:

QueryService queryService = this.cache.getQueryService();
String query = 
  "select * from /baeldung-customers c where c.firstName = 'Allan'";
SelectResults<Customer> results =
  (SelectResults<Customer>) queryService.newQuery(query).execute();
assertEquals(1, results.size());

7. Function

One of the more powerful notions of in-memory data grids is the idea of “taking the computations to the data”.

Simply put, since Geode is pure Java, it's easy for us to not only send data but also logic to perform on that data.

This might remind us of SQL extensions like PL/SQL or Transact-SQL.

7.1. Defining a Function

To define a unit of work for Geode to do, we implement Geode's Function interface.

For example, let's imagine we need to change all the customer's names to upper case.

Instead of querying the data and having our application do the work, we can just implement Function:

public class UpperCaseNames implements Function<Boolean> {
    @Override
    public void execute(FunctionContext<Boolean> context) {
        RegionFunctionContext regionContext = (RegionFunctionContext) context;
        Region<CustomerKey, Customer> region = regionContext.getDataSet();

        for (Map.Entry<CustomerKey, Customer> entry : region.entrySet()) {
            Customer customer = entry.getValue();
            customer.setFirstName(customer.getFirstName().toUpperCase());
        }
        context.getResultSender().lastResult(true);
    }

    @Override
    public String getId() {
        return getClass().getName();
    }
}

Note that getId must return a unique value, so the class name is typically a good pick.

The FunctionContext contains all our region data, and so we can do a more sophisticated query out of it, or, as we've done here, mutate it.

And Function has plenty more power than this, so check out the official manual, especially the getResultSender method.

7.2. Deploying Function

We need to make Geode aware of our function to be able to run it. Like we did with our custom data types, we'll package the jar.

But this time, we can just use the deploy command:

gfsh> deploy --jar=./lib/apache-geode-1.0-SNAPSHOT.jar

7.3. Executing Function

Now, we can execute the Function from the application using the FunctionService:

@Test
public void whenExecuteUppercaseNames_thenCustomerNamesAreUppercased() {
    Execution execution = FunctionService.onRegion(this.customerRegion);

    execution.execute(UpperCaseNames.class.getName());

    Customer customer = this.customerRegion.get(new CustomerKey(1));
    assertEquals("GHEORGE", customer.getFirstName());
}

8. Conclusion

In this article, we learned the basic concepts of the Apache Geode ecosystem. We saw simple gets and puts with standard and custom types, replicated and partitioned regions, and OQL and function support.

And, as always, all of these samples are available over on GitHub.