home link https://storj.io/
Builds a blockchain-based, peer-to-peer decentralized cloud storage network so that users can transfer and share data without relying on a third-party storage provider.
Executive Chairman & Interim CEO
Founder & CSO
VP of Engineering
What is Decentralized Cloud...
Tardigrade is the world’s first enterprise-grade, decentralized cloud storage service. Through decentralization, Tardigrade is more secure, more performant, more affordable, and more private by default than centralized cloud providers.

What exactly is decentralized cloud storage? On the user’s end, it operates exactly the same as traditional cloud storage options like Amazon S3. But instead of your files being stored in a big data center that’s vulnerable to outages and attacks, your information is stored on thousands of distributed Nodes all across the globe.

How the decentralized cloud works

First off, we aren’t going to get super technical here. This is an overview of how it works, so if you really want to dig into the technical specifications of Tardigrade (the nuts and bolts stuff), you can check out our documentation. Here’s also a cool diagram to reference.

As previously mentioned, with traditional cloud storage, all of your data resides in one large data center. This often leads to downtime and outages when one of these facilities goes offline. With decentralized cloud storage, we have a large, distributed network made up of thousands of independently owned and operated Nodes across the globe, which store data on our behalf. A Node is simply a hard drive or a storage device someone owns privately. We pay all of our Node Operators to store files for our clients, and we compensate them for their bandwidth. Think of it like this: you have a 10 TB hard drive, and you’re only using 1 TB. You could sign up to be a Node Operator, and we would store pieces of our clients’ files on your hard drive, using your unused space. Depending on how many files you store and how many times that data needs to be retrieved, we’d compensate you accordingly.

So why make it decentralized?

The real issue with centralized providers like Amazon S3 is that all of your data resides in huge data centers. 
If a part of Amazon’s network goes down, at best you won’t be able to access your data, and at worst your data could be permanently lost or damaged. Large data centers are also vulnerable to hackers, as we’ve seen time and time again. With decentralized cloud storage, end-to-end encryption is standard on every file: each file is encrypted on a user’s computer before it’s uploaded, broken into pieces, and then spread out to uncorrelated Nodes across our network. Only you have access to the encryption keys, making it virtually impossible for your data to be compromised or stolen.

Plus, centralized cloud storage costs a lot more than our decentralized network. Large data centers cost a ton of money and take a lot of resources to operate. Because we don’t have to spend money operating a data center, and instead use individual, privately owned devices, we pass those savings on to our customers.

But what about data loss or bad actors on the network?

We’ve never lost a single file on the Tardigrade network, thanks to our decentralized architecture. Tardigrade has 99.99999999% file durability. Since we split each file into 80 pieces and only need 30 pieces to reconstitute a file, it would take 51 Nodes going offline at the same time for your file to be lost. Complete files are retrieved at lightning speed by downloading the fastest 30 of 80 pieces. If you know how torrenting works, it’s the same concept. There’s no central point of failure, ensuring your data is always available. Since each file uploaded to Tardigrade is encrypted, split into 80 pieces, and then stored on 80 different Nodes, a single Node going offline won’t make any file unavailable.

The real beauty of the decentralized architecture lies in the fact that a Node Operator doesn’t know what files are stored on their Node. Even if a Node Operator wanted to access your files, they only have a small shard or piece of that file. 
They would have to gather pieces from enough other Nodes to reach the 30 needed to reconstitute a file, and all of those pieces are also encrypted. It’s virtually impossible to compromise a file.

Is it secure?

Storj is what we like to call “trustless.” What does this mean? It means you don’t have to place your trust in any single organization, process, or system to keep the network running, and you don’t have to worry about your data because we couldn’t access it even if we wanted to. Tardigrade is private and secure by default, and files are encrypted end-to-end before they’re ever uploaded to our network, ensuring no one can access data without authorization.

A file on Tardigrade is almost impossible to access without the proper keys or permissions. Because everything is encrypted locally, your data is literally in your hands, and no one else’s. After files are encrypted, they get split into smaller fragments that are completely indistinguishable from each other. A typical file gets split into 80 pieces, of which any 30 can be used to reconstitute the file. Each of the 80 pieces is on a different drive, with different operators, power supplies, networks, geographies, etc. For example, there are currently 171 million files on our Tardigrade service. To compromise a single file, the hacker would first have to locate 30 of its pieces among those of the 171 million files on the network, creating a needle-in-a-haystack scenario. Then they would have to decrypt the file, which is extremely difficult, if not impossible, without the encryption key. Then the hacker would have to do all this again to access the next file.

So, there you have it: the decentralized cloud in a nutshell. If you’re interested in becoming a Node Operator, please visit www.storj.io. And if you’re interested in trying Tardigrade for yourself, head on over to www.tardigrade.io.

Originally published at https://storj.io.
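To make the 30-of-80 arithmetic in the article above concrete, here is a minimal Python sketch. The piece counts come from the article; the function names are illustrative, and the assumption that every Node fails independently with the same probability is a simplification for illustration only, not a description of how Tardigrade’s durability figure is actually calculated.

```python
from math import comb

TOTAL_PIECES = 80     # pieces stored per file (figure from the article)
REQUIRED_PIECES = 30  # pieces needed to rebuild a file (figure from the article)


def is_recoverable(pieces_lost, total=TOTAL_PIECES, required=REQUIRED_PIECES):
    """A file can be rebuilt while at least `required` pieces survive."""
    return total - pieces_lost >= required


def min_losses_to_lose_file(total=TOTAL_PIECES, required=REQUIRED_PIECES):
    """Smallest number of simultaneously lost pieces that makes a file unrecoverable."""
    return total - required + 1


def loss_probability(p_offline, total=TOTAL_PIECES, required=REQUIRED_PIECES):
    """Probability that fewer than `required` pieces survive.

    Assumption (ours, not the article's): each piece's Node is offline
    independently with probability `p_offline`.
    """
    p_up = 1.0 - p_offline
    return sum(
        comb(total, k) * p_up**k * p_offline ** (total - k)
        for k in range(required)  # k = number of surviving pieces, 0..required-1
    )


if __name__ == "__main__":
    print(min_losses_to_lose_file())  # 51, matching the article
    print(is_recoverable(50), is_recoverable(51))
    print(loss_probability(0.10))
```

Under this toy model, even if 10% of Nodes were offline at any moment, the chance of 51 or more of a file’s 80 pieces being unavailable at once is vanishingly small, which is the intuition behind the article’s claim.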
20. 10. 16
NEM and IoDLT — Using Tardi...
NEM and IoDLT: Using Tardigrade to Accelerate MongoDB Snapshot Distribution and Storage

IoDLT is a company that secures, records, and monetizes data from a wide array of sources (including IoT devices).

The company and wider NEM ecosystem have had difficulty making their blockchain data easy to access for application developers. By working with IoDLT to build a solution leveraging Tardigrade and MongoDB, we were able to build a decentralized, resilient solution that improves MongoDB sync performance while decreasing costs versus centralized providers.

IoDLT has implemented the MongoDB integration for Tardigrade within their own stack to back up their important data to our decentralized cloud storage service. Tardigrade helps IoDLT optimize its workflow, decrease costs, and increase performance for developers building on their blockchain platform.

“By using Tardigrade to store and distribute the MongoDB snapshots needed to set up a Symbol node, the process is reduced from 44 minutes to 2–3 minutes, an almost 1,500% improvement,” said Bader Youssef, IoDLT Chief Technology Officer. “Our team is proud of the Tardigrade integration we built and we’re looking forward to seeing how NEM community members and users take advantage of these new benefits from Tardigrade.”

In this post, we’ll give an overview of how this solution architecture works to extend the Symbol stack for NEM application developers, making chain data easier to access, download, and query using Tardigrade and MongoDB.

How MongoDB and Tardigrade are Used in the Symbol Application Stack

In the Symbol application stack, the MongoDB database is used as a secondary data store for the API nodes, which are queried by various client applications.

The primary responsibility of the API nodes is to store the data in a readable form. 
NEM uses MongoDB in conjunction with a broker to push messages in real time to the MongoDB instance, making NEM chain data more accessible and queryable.

The MongoDB instance retains chain data: blocks, transactions, metadata state, and generally the entire chain. This architecture allows client apps to access the blockchain data quickly through the REST nodes.

There’s an interesting problem that can arise from this architecture: as the chain progresses, the database naturally grows, increasing the time-to-sync for a full node.

On small-form-factor devices, such as those run by IoDLT, this makes it hard to run full nodes, because these devices have fewer compute and bandwidth resources.

To help solve this problem, IoDLT has built a connector for Tardigrade that periodically snapshots the chain’s MongoDB instance to the decentralized cloud, making it easy to quickly rehydrate state to any location across the globe.

How IoDLT Built a Data Stack Without Centralized Silos

Developers who wish to try the combination of Storj and Symbol can try running a node on IoDLT’s Symbol devnet. This node will have all the normal functionality of a full Symbol node, plus an instance of Storj’s MongoDB backup tool. The configuration will automatically back up a node’s MongoDB instance every day.

We have a core Docker container that installs the connector-MongoDB tool. Once it’s installed, two other containers are run. First, a restore script is run on startup. This checks for the latest backup and automatically restores the MongoDB instance to that backup.

Next, another container uses a cron job that runs the backup tool every day. This setup uses the Docker network to communicate between the processes.

When the node operator restarts the node, it grabs the latest “light” version of the chain so that data is made available in MongoDB. Meanwhile, the node syncs and verifies with the rest of the network. 
Once the node reaches the chain’s block height of the backup from Tardigrade, new chain data is made available for further backup.

To get started, get a node running using these instructions: https://github.com/IoDLT/catapult-devnet/tree/feat/backups

Speeding up Sync Performance with Tardigrade

Once this node is running, the docker-compose setup includes Storj’s backup tool as part of the stack. Along with automating the Symbol node setup process, our configuration will automatically back up the MongoDB instance to the Tardigrade network. In the event that the node needs to be reset, or in the unfortunate case of a sudden shutdown, the node can pull down the latest backup from a Tardigrade bucket.

This means client apps that heavily depend on data from MongoDB will be able to quickly get access to historic chain data without the long sync time. Take the IoDLT Symbol Network as an example: at the time of this article, it sits at 700k blocks. If a node operating on that network experienced a technical difficulty, there is a chance that the data on that system would be corrupted or otherwise inaccessible.

Relying only on Symbol’s broker process, it would take roughly 44 minutes to sync the chain (and for the data to be available). Using Tardigrade, assuming a backup is available, the same data would be available for use in only about 2–3 minutes, which is about a 93% reduction in sync time, or an almost 15x increase in performance. The node still has to fully sync in the background, but now chain data can be used while that occurs. This, of course, will vary depending on the size of the chain and the content within, but generally, using Tardigrade greatly improves sync speed for API access.

Easily Access Symbol API Data

Many Symbol applications rely on direct data from the blockchain. 
Whether that’s getting an account’s Multi-Signature graph information, viewing a past Aggregate Complete smart contract, or querying for an account’s Metadata, client-side apps that use Symbol depend on the API portion being constantly up to date and in sync.

This method of quickly reviving a MongoDB instance from Tardigrade allows use cases such as a mobile app to quickly get back on their feet without switching nodes.

Why did IoDLT Choose Tardigrade over Centralized Cloud Providers?

The best part of Tardigrade is that the service includes default features, like end-to-end encryption on every file, that would otherwise cost organizations a large premium. Every file uploaded to Tardigrade is stored with statistically uncorrelated, cross-geographic redundancy by default, which is similar to a multi-region cloud backup architecture.

This means cloud storage backups are always accessible, and a single outage won’t cripple our operations like it would in a centralized cloud storage architecture. Tardigrade backups are available everywhere, and with hot availability, developers can quickly and easily rehydrate their MongoDB instance from anywhere in the world.

If you are a MongoDB developer and are interested in learning more about this API node architecture, reach out to: email@example.com.

Originally published at https://storj.io.
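The sync-time figures quoted in the article above (44 minutes versus 2–3 minutes) can be checked with a few lines of Python. This is only a worked calculation on the numbers given in the article; the function names are illustrative.

```python
def reduction_pct(before_min, after_min):
    """Percentage of the original time that was saved."""
    return (before_min - after_min) / before_min * 100


def speedup(before_min, after_min):
    """How many times faster the new process is."""
    return before_min / after_min


if __name__ == "__main__":
    BROKER_SYNC_MIN = 44          # full sync via Symbol's broker, per the article
    for restore_min in (3, 2):    # restore from a Tardigrade backup, per the article
        print(
            f"{restore_min} min restore: "
            f"{reduction_pct(BROKER_SYNC_MIN, restore_min):.2f}% less time, "
            f"{speedup(BROKER_SYNC_MIN, restore_min):.1f}x faster"
        )
```

At the 3-minute end of the range the restore is about 14.7 times faster (a reduction of roughly 93%), which lines up with the “almost 15x” figure; at 2 minutes it is 22 times faster.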
20. 09. 24
Choosing Cockroach DB for H...
By Krista Spriggs and Jessica Grebenschikov

Here at Storj Labs we just migrated our production databases from PostgreSQL to CockroachDB. We want to share why we did this and what our experience was.

TL;DR: Our experience has convinced us that CockroachDB is the best horizontally scalable database choice in 2020.

Why use a horizontally scalable database in the first place?

Our top goal at Storj is to run the largest, most secure, decentralized, and distributed cloud storage platform. Our cloud storage holds its own against AWS S3 and Google Cloud Storage in performance and durability and also goes further by improving reliability since it’s fully distributed. In order to compete on the same scale as the big cloud providers, it’s crucial we can scale our infrastructure. One of the ways we are doing this is by using a horizontally scalable database. To meet our first goal of storing an exabyte of data on the Storj network, the current architecture will store over 90 PB of file metadata. Additionally, it’s vital that the Storj Network can withstand multi-region failures and still keep the network up and the data available. All of this is made relatively easy with CockroachDB!

What about other horizontally scalable databases?

We considered a number of different horizontally scalable databases, but for our needs, CockroachDB consistently came out on top.

When considering a database that will horizontally scale, there are three main categories of options:

- Shard your current relational database.
- Use a NoSQL key-value database, like Cassandra or BigTable.
- Use a NewSQL relational database, like Spanner or CockroachDB.

Before the 2000s there weren’t any horizontally scaling database options. The only way to scale a database was to manually shard it yourself, which is typically very tedious and kind of a nightmare. For example, it took Google over two years to shard some of their MySQL instances [1]. Quite the undertaking! 
No wonder Google came up with the first “NewSQL” horizontally scalable relational database, Spanner.

In the early 2000s, NoSQL databases became all the rage since they were the first horizontally scaling options. However, NoSQL has some tradeoffs, mainly weaker consistency guarantees and no relational model. And here we are in the 2020s, finally getting what we always wanted: the rise of the strongly consistent, relational, horizontally scaling database.

What’s involved in adding CockroachDB support to our application?

Here is our process for getting CockroachDB up and running with our application:

- Rewrote incompatible SQL code to be compatible with CockroachDB.
- Performance and load tested in a QA environment.
- Learned how to evaluate analytics about CockroachDB running in production.
- Migrated production environments off PostgreSQL and over to CockroachDB.

Writing Compatible SQL

One of the first parts of integrating with CockroachDB was to make sure all of our existing SQL was compatible. We were already backed by Postgres, and CockroachDB is compatible with the PostgreSQL wire protocol, so we simply replaced the Postgres connection string with a CockroachDB connection URL and observed what broke. At the time (around v19.2-ish) there turned out to be quite a few PostgreSQL things that weren’t supported. Here’s a list of some of the highlights:

Due to some of these unsupported Postgres features, we had to rewrite our migrations. Additionally, when we needed to change a primary key, this resulted in a more tedious migration strategy where you create a new table with the new primary key, then migrate all the data over to it and drop the old table.

While this process to make our SQL compatible was a bit more tedious than I had hoped (it ended up taking about a month of full-time developer time), I still feel like it was much easier than migrating over to Spanner or another database without Postgres compatibility. Additionally, since then (now in CockroachDB v20.1), many compatibility issues have been resolved. 
CockroachDB is under fast development and is constantly listening to feedback from end users and adding features per requests.

End-to-end testing, performance and load testing

Once we had all the compatible SQL in place and all our unit tests passed, we then deployed to a production-like environment for some end-to-end testing and performance and load testing. Out of the box some things were faster than GCP CloudSQL Postgres, while some things were a teeny bit slower.

Performance Improvement Metrics

One of the database access patterns we use is an ordered iterator, where we need to walk over every record in the database and perform some processing on each row. In our largest database with over six million rows, this iteration was getting very slow with the CloudSQL Postgres database, taking about 13 hours, which was way too long. After we migrated to CockroachDB, processing every row in order was down to two hours!

Additionally, we wrote a smaller script that emulated our production iterator access patterns, but in a more isolated coding environment, and got the following results when performing processing on each row in order. CockroachDB was much faster. We think there are a few reasons for this improved performance, one being that the data in the CockroachDB cluster is split across many nodes, which increases read throughput.

Speed of ordered iterator:

    Records       CockroachDB   CloudSQL Postgres
    100,000       3.5s          56.8s
    1,000,000     18.8s         4m53.3s
    10,000,000    14m0.5s       1h48m25.1s

Another awesome feature of CockroachDB is prefix compression. CockroachDB data is stored in sorted order by the primary key, and any prefix shared with the previous record is dropped [2]. This saved a lot more space than we expected. 
While the data stored in CockroachDB is replicated three times (by default), the additional bytes on disk were just a little over two times Postgres, since the prefix compression saved quite a bit of space.

Prefix compression:

    Database                                Size     Rows         Bytes/row
    CloudSQL Postgres                       239 GB   65,323,332   ~3658
    The same database ported to CockroachDB 186 GB   65,323,332   ~2846

End-to-end Testing

During end-to-end testing, there were three main issues we encountered:

- Retry errors
- Transaction contention
- Stalled transactions that never completed

Retry Errors

Everything in CockroachDB is run as a transaction: either an explicit transaction, if the application code creates one, or an implicit transaction that CockroachDB creates otherwise. If the transaction is implicit and fails, then CockroachDB will retry for you behind the scenes. However, if an explicit transaction is aborted or fails, then it’s up to the application code to handle retries. For this, we added retry functionality to our database driver code.

Transaction Contention

We experienced much more transaction contention with CockroachDB, and therefore aborted transactions and also slower performance with some of our database access patterns. The following changes greatly improved these issues:

- Use a smaller number of connections, fully saturated, to help eliminate contention and improve throughput. This is especially helpful when there are many connections reading/writing from the same range.
- Multi-row upserts: instead of updating one row at a time, send a single request with many upsert statements together.
- Blind writes to help reduce contention.
- Bulk inserts.
- More techniques are covered in the CockroachDB docs.

Stalled Transactions

We encountered some unusual behaviors where we had long-running queries, sometimes taking over two hours.

    -- run query from CockroachDB CLI to see age for active queries
    > SELECT node_id,
             age(clock_timestamp(), oldest_query_start::timestamptz),
             substring(active_queries, 0, 50) AS query
      FROM [SHOW SESSIONS]
      WHERE oldest_query_start IS NOT NULL
      ORDER BY oldest_query_start ASC
      LIMIT 10;

    node_id | age             | query
    1       | 02:24:29.576152 | INSERT INTO example_table(id, first, last...
    2       | 02:23:59.30406  | INSERT INTO example_table(id, first, last...
    3       | 02:23:51.504073 | INSERT INTO example_table(id, first, last...
    4       | 02:23:35.517911 | INSERT INTO example_table(id, first, last...
    5       | 02:23:21.543682 | INSERT INTO example_table(id, first, last...

Another way to see slow query times is:

    SELECT * FROM [SHOW CLUSTER QUERIES] WHERE start < (now() - INTERVAL '1 min');

Luckily we run our databases on CockroachCloud, so the Cockroach Labs SREs and support team came to the rescue! They worked hard with us to debug this persistent stalled transaction issue that reared its head when we put the system under heavy load. It turned out that there was a recent change merged into v19.2 whose potential downside is that UPDATE-heavy workloads experiencing high contention on many keys may have worse performance (up to O(n²)). This seems to be the downside we hit. 
To solve this problem they deployed a patch to our cluster, and now in v20.1 the fairness is adjusted when transactions conflict to reduce latencies under high contention.

While this ended up being a one-off bug with the re-scheduling of transactions that CockroachDB has already implemented a fix for in v20.1, I think it’s an interesting experience to share because it displays a number of the reasons I love CockroachDB: they work hard and fast to build a high-quality product, and they consistently offer top-notch technical support.

Migrating the data during a maintenance window

Once we had completed all of our testing, we started migrating the actual data to CockroachDB. Luckily we were able to verify the migration steps and estimate how long it might take us for each of our deployments. We chose to use one of our maintenance windows, which are Sundays from 2 am to 6 am Eastern Time. With the teams ready, we began the migration steps. In order to ensure we didn’t miss any data, we had to stop all services within the deployment to stop all transactions. The next step was to take a dump of all the data, with databases ranging in size from 110 GB to 260 GB. After waiting for the data dump to complete, we sorted the data to maximize import efficiency when importing to CockroachDB. The sorting steps took the longest, between 40 and 75 minutes. A small misjudgment on the size of these intermediate VMs meant that this step ended up taking significantly longer than we had estimated. With the sorting completed, we uploaded each snapshot to a cloud storage bucket and prepared our credentials to be passed to the CockroachDB servers to access the data dumps. The imports themselves took between 40 and 75 minutes as well. 
Once we had all the data imported, we validated several tables to ensure the data had indeed been successfully imported, and then made the configuration changes for our application to be able to talk to its new database backend.

Adapting Existing Tooling to CockroachDB

To ensure we have all of our bases covered, we have a lot of automation around our deployments. One of these pieces is a backup step that ensures we have a snapshot backup of the database before we run any migrations. With the managed services for PostgreSQL that we’ve used in the past, we’ve always had an administrator API we can hit to trigger an automated backup. Cockroach Cloud already does a great job at running periodic backups as part of their managed service, but our requirements state that the backup we take before deployment is as close in time to the migration step as possible.

We modified our existing scripts to execute the CockroachDB backup commands directly in the database to trigger a full backup to one of our own storage buckets. Once these backups are complete, we can move forward with our existing deployment pipeline.

[1] Google Spanner white paper
[2] RocksDB, underlying KV store of CockroachDB: details of prefix compression

Originally published at https://storj.io.
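The Retry Errors section of the article above says that explicit transactions must be retried by the application. The authors’ actual driver code is not reproduced here, so the following is only a minimal Python sketch of the general pattern, under stated assumptions: `TxnError` is a hypothetical stand-in for a database driver exception that exposes a SQLSTATE code, and `run_with_retry` is an illustrative helper name. CockroachDB reports retryable transaction errors with SQLSTATE 40001.

```python
import random
import time

RETRYABLE_SQLSTATE = "40001"  # serialization failure; signals "retry this transaction"


class TxnError(Exception):
    """Hypothetical stand-in for a driver error that carries a SQLSTATE code."""

    def __init__(self, sqlstate, message=""):
        super().__init__(message)
        self.sqlstate = sqlstate


def run_with_retry(txn_fn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Run `txn_fn` (one whole explicit transaction), retrying on retryable errors.

    `txn_fn` must be safe to re-run from the start: it should begin, execute,
    and commit the transaction each time it is called.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except TxnError as err:
            # Give up on non-retryable errors or when attempts are exhausted.
            if err.sqlstate != RETRYABLE_SQLSTATE or attempt == max_attempts:
                raise
            # Exponential backoff with jitter so contending clients spread out.
            delay = base_delay * (2 ** (attempt - 1)) * (1 + random.random())
            sleep(delay)


if __name__ == "__main__":
    attempts = {"n": 0}

    def flaky_txn():
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise TxnError(RETRYABLE_SQLSTATE, "restart transaction")
        return "committed"

    print(run_with_retry(flaky_txn), "after", attempts["n"], "attempts")
```

The helper retries the whole transaction function rather than a single statement, because a retryable error invalidates everything the transaction has done so far.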
20. 08. 11
Remote but Not Alone
The significant shift toward remote work brought on by the pandemic looks like it’s going to have a lasting effect, mainly because of the long list of benefits, from increased productivity to general employee satisfaction. While the shift to remote work has a lot of great things going for it, one thing most newly remote workers will miss is the opportunity for face-to-face interaction. I’ve been fully remote for the last two years, and for me that was probably the biggest adjustment. Millions of Zoom meeting memes over the last few months make me think I’m not alone on this one.

1-on-1 meetings are a staple of effective management, whether they’re with direct reports, skip-level team members, peers, or mentees. 1-on-1s are an area where remote work actually makes things more challenging, especially when so much communication is nonverbal.

That quick chat over lunch requires a Zoom meeting. You can’t step out to grab a coffee when you’re time zones apart. But, like all the other adaptations for remote work, there are ways to make remote 1-on-1s as effective as possible.

Our team at Storj Labs is remote-first, meaning we’ve built all our processes with remote work top of mind. With team members in 14 countries, we’ve developed and tried a wide range of tools, techniques, and processes to make our remote team productive and happy. We use 1-on-1s as an important tool to keep our distributed workforce aligned, particularly across atomic teams that frequently include team members from different countries and time zones. 1-on-1 meetings can help keep employees connected, build stronger relationships, drive culture, and develop mutual understanding. Whether your 1-on-1 is a regularly scheduled meeting with a manager or peer or an impromptu meeting, here are some tips on how you can get the most out of 1-on-1s when you can’t meet in person.

1. Allow space

A 1-on-1 session should feel like a positive, safe space. 
This is the time to listen to your team members talk about their own challenges and learnings and work together to solve problems at hand. Get to know your colleague on a personal level; this is vital for building trust and connections. 1-on-1s shouldn’t be used for corrective action or discipline, but constructive feedback is healthy.

2. Don’t make it about tasks

When you and a team member are passionate about your company and project, it’s so easy to use 1-on-1 time to talk about projects in flight. Too often, 1-on-1s can become hijacked and turned into working meetings or formal course corrections, rather than time for open, honest communication. Getting too far into the weeds negates the goal of creating a positive, safe space. You don’t want employees to start dreading meetings or miss out on the opportunities 1-on-1s offer to foster better communication.

3. Ask questions

1-on-1s are a great opportunity for active listening. I personally try to listen more than I talk. I always ask some variation of two questions when meeting with someone on my team. The first is a general “How are you doing: overwhelmed, underwhelmed, or just whelmed?” and the second is “What can I do for you?” Asking questions like this helps identify blockers or risks our team members are going through and also alerts us to any fires that may need to be put out.

4. Arrive with a plan, but allow for flexibility

I always try to come prepared with a rough agenda for an upcoming 1-on-1 and encourage my colleagues to do the same. Most of my team members either send a list of topics in advance over Slack or keep a running agenda. I try to limit the topics that are either covered in other meetings or more project oriented. I also find it helpful to keep some time for just generally checking in on how the other person is doing: to hear about successes and blockers, or to watch for signs of stress and burnout.

5. Meet with people you don’t work with

I find skip-level and cross-org 1-on-1s are an effective way to help build valuable relationships and understanding across departments. By making a point of doing a 1-on-1 a week with someone I don’t often work with, I’m able to have a better understanding of the challenges facing other teams. Many times, we could easily help others solve the problems they’re facing, but we lack awareness of the challenges, and they don’t know how to use us as a resource. Meeting with your more junior team members can give them context about the problems they’re working on and how they impact the larger business, as well as help them feel part of something great. After all, as a remote company, we can’t bump into each other in the hallways, so these 1-on-1s create needed connections and information cross-pollination.

6. Try not to reschedule

We’re a startup with a newly launched product in growth mode, and it’s a challenging environment for any company right now. It seems there are always high-priority customer and media calls, and schedules are frequently juggled. With 1-on-1s, it’s important to keep a regular cadence of meetings. If you can’t make it for a valid reason, reschedule immediately, but don’t make it a habit. Internal meetings tend to be the easiest and first meetings to move, but it’s important to have consistency with 1-on-1s, and especially important to make sure the meetings happen and don’t get pushed to the next week or repeatedly rescheduled. Treating the meetings like a high priority clearly communicates that you value the time.

7. Maximize your access to nonverbal cues

Sometimes chats in Slack or email get locked in patterns of miscommunication, especially where people are passionate about a particular point or where topics are emotionally charged. 
The majority of communication is nonverbal, and written communication tools like chat and email are pretty much doomed to fail in highly charged communications when the only nonverbal tools in chat are typing in ALL CAPS or emojis! Breaking away from emotional exchanges in writing, and taking advantage of nonverbal cues through an actual phone or voice call or video chat, can dramatically increase the probability of successful resolution.

The shift to remote work is new for many organizations. Adapting some of the tried and true tools like 1-on-1s can significantly increase the probability of achieving the many benefits of remote work. I feel strongly, and have seen first-hand, the way remote 1-on-1s can drive culture when executed well.

By John Gleeson on Remote Work

Originally published at https://storj.io/blog on July 24, 2020
20. 07. 24
Managing Remote Teams
Managing a remote team might be an entirely new concept for you. Because of COVID-19, many of us are working from home full-time, for the first time, and will be doing that for the foreseeable future. At Storj Labs, most of our team is already remote. We have employees in 26 different states and 13 different countries, so remote work and managing remote teams isn’t a new concept for us. We currently have three satellite offices, but coming into the office is optional. With that said, we’d like to share what we’ve learned over the years of managing remote teams with people who may be relatively new to the concept because of the pandemic.

What is Remote First?

You may have heard a lot about remote-first teams recently. In case you don’t know what that means, we’ll explain. Remote first means that the company’s organization complements working in remote environments. All information is shared online. There are processes to ensure that remote employees are not unintentionally treated as second-class citizens. All meetings take place over video chat with the ability to record if needed, even if the majority of attendees are in the same office. Slack, Zoom, and Confluence are go-to tools in our remote-first workplace. You can read more about remote-first work cultures in a previous blog in this series.

Here are a few examples of some challenges remote managers might face, as well as solutions.

Challenge #1: Productivity anxiety

If you’re managing a team remotely, you may worry your direct reports aren’t working, or that they’re not getting their work done efficiently. You can’t just pop by their desk to see if they’re looking at their Twitter feed. Rather than assessing productivity based on how many hours someone is at their desk, remote working lends itself to naturally evaluating your team members based on their work output, regardless of how much or how little time someone spends completing a task. 
In our opinion, this is a much better way of measuring productivity, because at the end of the day, who cares about the number of hours clocked as long as the work gets done well?

You can also add more accountability from both the managerial side and the employee side with goal setting. We’re somewhat new to goal setting ourselves, but here are a few things we’ve learned.

Do the goals being set tie to personal/professional development and growth, or are they solely focused on objectives and key results (OKRs)? It’s important to strike a balance between the two so that an employee doesn’t feel their goals are tied exclusively to one thing.

Use goal-setting as a way to address deeper issues regarding a person’s personal work experience. As we said, it’s important to focus on OKRs, but strike a balance. For example, encourage your employees to block off time for “family” or “personal” time. If your direct report is experiencing stress from working at home with kids present, seeing “personal time” or something similar on their calendar will remind others that this time should not be interrupted. Being able to strike a balance between work and home will be key in making sure your employees are more productive and can achieve their goals.

Challenge #2: Distractions

Unfortunately, working from home brings a whole new set of distractions that aren’t ideal for work environments. Whether it’s family members, pets, bad internet, a comfy hammock calling your name, your favorite TV show, or video games, there are a lot of distractions you probably wouldn’t have in an office setting. The first place to start is to try and create an actual workspace. If your company offers computers, desks, or other office supplies, that’s a great place to start. 
A dedicated workspace will help an employee feel more focused and less apt to be distracted by the trappings of “home.” If your company can provide reimbursement for internet costs or even child care, these are super valuable options to offer employees in a remote-first company. We know that COVID-19 has made child care unavailable for the moment, but this won’t last forever, so it’s an important option to consider.

It’s also important to remember that if your employee is working from home, they may need a little more flexibility than in an in-office situation. Your staff may need to tend to pets, pick up children from school, take care of family members who are ill or elderly, and more. Be more flexible with their schedules as long as they communicate these needs clearly with teammates and fulfill their project commitments.

Challenge #3: Communication (or lack thereof)

When employees don’t see their teammates and managers every day, they may struggle with seemingly less access to managerial support, accountability, and overall communication. They may also feel they are missing out on important information that was only communicated in a meeting they weren’t in, or between people “at the water cooler.” Cultural differences and language barriers could also be a concern, depending on your organization.

The most important step you can take as a manager is to build trust by communicating more often. Check in daily to see if you can support your team’s efforts, or try to facilitate communication between employees if necessary. Make sure to listen and ask questions that help people feel comfortable opening up about matters that might not be going well. If team members express feelings of isolation, make sure to acknowledge them.

Another solid idea is to have more frequent 1:1 meetings. We suggest weekly or bi-weekly, depending on the need. 
During 1:1 meetings, allow time for communication about personal experiences, and make sure to remind people it’s OK to take a break and address family needs, personal needs, or take a mental health break. Work is important, but burnout is real, so be flexible. During 1:1 meetings, establish team norms. Set ground rules around availability and Slack messages, and make sure to gain clarity around when someone is available and when they are away focusing on work and shouldn’t be interrupted. As a manager, you can help set the tone by not messaging people on Slack late at night, or even by displaying work hours for different time zones if your workforce is in different countries.

Besides 1:1 meetings, it’s vital to try and establish a work culture and build relationships with the people you manage. It’s much harder to do when you don’t see each other every day, but we’ve found a few substitutes for in-person interactions and team building. Host a monthly social hour during work hours via Zoom, Google Meet, or your preferred online meeting method. During this time, just let people chat, play games, share a beverage online, whatever. The important thing is to socialize and try not to talk shop. Another effective way to help people feel more involved is to hold quarterly all-company meetings. At Storj, we generally bring the whole company together from all over the world to gather and interact, learn, and solve problems. Obviously, we can’t do this at the moment, but even having an online all-hands lets everyone, in every department, gather together, see and speak with one another, and feel like they are part of a community.

This pandemic won’t last forever, but we hope that during this time of increased remote work, these tips will foster a better online culture for your workforce. We’re still learning, like the rest of you, but we did feel we had a little bit of a leg up on the whole remote thing since many of us have been doing it for years.

This blog is a part of a series on remote work. 
In this series, we share examples of how we address remote work at Storj Labs, where the majority of our team works remotely. To view the rest of the blogs in this series, visit: https://storj.io/blog/categories/remote-work.

By Katherine Johnson and Jennifer Johnson on Remote Work
Originally published at https://storj.io on July 22, 2020
20. 07. 23
Relationship Building Over ...
Storj Labs decided early on that Tardigrade.io, the leading globally-distributed cloud storage platform, should be built by an equally distributed team. As a result, Storj now has workers spread across 24 cities in 13 countries. While we definitely know a thing or two about effectively managing remote teams, it doesn’t come without its challenges. Despite remote work becoming more popular and widely adopted, especially in a post-COVID-19 world, there are significant trade-offs, both for the company and the employee.

In Buffer’s 2019 State Of Remote Work report, loneliness and collaboration ranked among the top three struggles for remote workers. This seems like a no-brainer, right? The basic definition of “remote” refers to faraway isolation, so feelings of alienation are expected. Regardless, the research and data on interpersonal connection at work are widespread and abundant (see here, here, and here), and it all comes to the same conclusion: social interaction is a key contributor to overall well-being, which directly impacts employee engagement and productivity.

While remote work can never perfectly simulate in-office life, it’s important for companies to be intentional about making remote work feel less, well, remote. Many companies, Storj included, host a few company retreats each year in which all employees are brought together for face-to-face collaborative work and recreation. While these retreats are powerful and invigorating, they can’t be expected to make up for the other 48+ weeks of isolation.

The best (and simplest) answer for building relationships and increasing engagement among remote employees is to leverage (and get creative with) what the company already has on hand: video conferencing tools. 
The aim of this blog is to provide some concrete ideas for how your team can best use video conferencing to improve employee engagement and connection.Action Items for Increasing EngagementDefault to VideoA great place to start is to simply make video conferencing the consistent norm for your team. This won’t replace email or chat-based communication, but defaulting to more face-to-face interaction, albeit mediated, can break up the isolation and create opportunities for connection and casual conversation. While it might be difficult to enforce as a policy, perhaps try to make an unspoken rule of taking the first few minutes of a meeting to chit chat with your colleagues. As always, be respectful of time, but don’t take an opportunity to laugh and unwind for granted. It’s striking to see just how much of a pick-me-up a light-hearted conversation can be.With all this being said, it’s important to remember that video meetings feel very different than in-person meetings and can have some unintentional effects. As such, it’s well worth your time to observe and gauge the effectiveness of your remote meetings and adjust accordingly.Intentionally Let LooseWhen it comes down to it, setting aside a specific time to connect with other remote colleagues is one of the better things to do. Whereas in-office life is replete with opportunities to chat and recreate with co-workers, remote work easily creates an environment of “all hands on keyboard, all the time.” Intentionally blocking off time to set projects aside and socialize is a great way to boost this necessary interaction. Due to the direct impact on employee satisfaction and overall productivity, it would be wise for leadership at the highest levels to proactively support and encourage such e-gatherings.Right now you’re probably saying “OK, but what should we actually do during those times?” The ultimate goal is to provide several opportunities each week for people to connect, not impose more demands on them. 
As such, prioritize low friction over perfect attendance.Here are a few suggestions:Virtual Social/Happy HoursStorj Labs has recently rolled out a weekly “Storjlings Social Hour” on Friday afternoons. There is nothing structured or formulated during these meetings — people simply show up and enjoy the chat! There isn’t anything sacred about holding these meetings on Friday, so do what works best for your team. For example, you might find out that people would prefer to have a mid-week get-together to get them over the hump and finish the week strong. There’s also no reason to not hold multiple social hours throughout the week.You’ll find that sometimes you and others might contribute significantly to the conversation, while at other times you keep on mute and just enjoy the banter. Regardless of the participation level, the underlying idea is to set projects aside for a bit and enjoy the respite. For many, cheerful voices and the opportunity to laugh are deeply therapeutic and rejuvenating. Though our Social Hours are actually only a half-hour long, the spike in energy they offer is incredible.Ever-Present Water CoolerOne idea that’s worth trying is having a video conference meeting in the calendar that’s open all day. Instead of a dedicated time slot for people to gather, the “water cooler conference” provides an opportunity for folks to pop in at random times of the day when they need a moment to rest their brains. If you’re wanting to mimic the spontaneity of in-office chit-chat and respite, this is a great option.While it might add an extra step, it could be worth creating a Slack (or another chat program) channel to alert when folks are at the water cooler. In a similar fashion, Storj has a channel called #weasley-clock to alert the team when someone is stepping away from their computer to grab lunch or run an errand. 
As such, a water cooler channel wouldn’t be a stretch since such open and transparent communication is central to our culture.Game TimeThe appropriateness of playing games at work is still a matter for debate, but a host of research and data shows games are great for relieving stress and building camaraderie. Any executive would agree managing stress loads and increasing cohesion among employees are great objectives for the company; not only are they good for employee satisfaction and morale, but they can also have a direct impact on productivity (read: the bottom line). As such, why not break down some of the traditional assumptions about work and make things a bit more jovial?The internet provides ample opportunities to play games with people all over the world. Some obvious examples are the online multiplayer functionality through the leading game consoles in addition to popular PC-based services like Steam and Stadia. You can also play several classic board games online as well. While these are great options, they require everyone to own the same consoles, software, and games. As such, these methods won’t serve as universal options for everyone. Luckily, there’s another option.It has become much easier (and popular) to play games through video conference tools. In fact, some video chat programs like Houseparty come with built-in games! Because these tools provide a face-to-face element, party games are often the most common. Jackbox, for instance, offers a selection of “party packs” that each contain a suite of fun and engaging games. There are also some great digital escape rooms like this one for Harry Potter fans, and the Random Trivia Generator has become a go-to for trivia lovers.There are so many different options available for folks who want to connect with others through online games. As with anything, it comes with some trial and error and the occasional technical difficulty. 
While any video conference tool would work just fine, Zoom’s screen and audio sharing capabilities are a bit more optimized for video games. Also, don’t forget that you’re not confined to video or web-based games. Since everyone has a camera and mic, “in-person” party games like charades are possible as well. Some (brave) folks have even hosted karaoke and talent shows! The options are endless.Final ThoughtsAs stated before, the main goal here is to increase employee satisfaction and engagement by providing remote workers with opportunities to connect and build relationships with one another. However, it’s important to remember games and social hours aren’t enough on their own. Instead, they should be supplemental actions woven into a greater culture of engagement and inclusion. While it’s always good to start somewhere, we encourage businesses and leaders to focus primarily on building a firm foundation of transparent communication, employee empowerment, and purpose.Storj Labs is excited about the future of work, specifically the increased adoption of remote teams. We’re also excited about tearing down walls and barriers at work that hinder people from building impactful and fulfilling relationships. We would love to hear what your team is doing to bring your teams closer together.This blog is a part of a series on remote work. In this series, we share examples of how we address remote work at Storj Labs, where the majority of our team works remotely. To view the rest of the blogs in this series, visit: https://storj.io/blog/categories/remote-work.By Ben Mikell on Remote WorkOriginally published at https://storj.io on July 20, 2020
20. 07. 21
How to Maintain Productivit...
It’s difficult to be productive in even the best circumstances, but the current state of the world presents unique challenges. At Storj, the majority of our team has worked remotely since day one, so we’ve had extra time to experiment with what works best in a distributed and remote environment.

Here’s our best advice for maintaining productivity while encouraging a healthy work culture in a remote work environment:

Get the timing right

Get the time zones right. Storj’s workforce consists of about 50 people working across 20 cities in 11 countries, so we have many different time zones to work around. We don’t mandate that everyone be available to work at specific times in the same time zone because it would put too much strain on our employees: after all, noon in Utah is midnight in New Delhi. Instead of trying to force everyone into the same time zone, we’ve organized teams around time zones, while trying to avoid setting company-wide meetings outside of an hour-or-two window where everyone can convene.

We’ve adopted this stance specifically to widen our opportunities in hiring, which means we aren’t constrained to hire by time zones or locations. Instead, we look purely at skill sets, going as far as making the application phase of our hiring process anonymous. This has allowed us to hire some of the best developers and employees around the world, including developers in locations that are much less competitive than the Bay Area, for example. Widening our net has made our team more diverse, which increases job satisfaction, company success, and employee retention.

Incorporate different solutions for different kinds of workers

At Storj, we make very deliberate choices around synchronous vs. asynchronous methods of communication in an effort to respect the ways our employees work best. Synchronous communication uses tools such as Google Meet or Zoom to hold live, interactive meetings. 
But those meetings are disruptive and not nearly as efficient as they should be (we record them, so people who could not attend and participate can still at least observe).Asynchronous tools allow for employees to address issues at their own pace. Email, design documents, code review, ticket comments, and discussions, etc., are all examples of asynchronous communication; an employee can choose when to read messages. The main thing about asynchronous communication is the participants of the conversation should not all have to be active at the same time. This is especially useful for people who use their working hours to create valuable additions to the business; they’re much more productive when they’re not interrupted for meetings throughout the day.On the other hand, there are still times where synchronous work is warranted:Managers are often event/interrupt-driven and need synchronous solutions where they can decide how to respond immediately.An asynchronous culture of writing also requires a culture of reading, and sometimes it’s simply more effective to sit someone down and talk through a document together.You can save hours of video chatting by spending weeks of passive-aggressively leaving code review comments! Haha, just kidding. Programming is fundamentally a work by people for people. The best way to get on the same page about a plan is to spend a couple of minutes using more communication cues than text.Brainstorming or spitballing ideas is always important. Some people do this very effectively with long-form writing, while others don’t.Extra adviceThere are plenty of well-intentioned articles out there offering advice for maintaining productivity at home, but most of it hasn’t been helpful or doesn’t feel applicable to many. Why, for instance, do you need to get dressed like you’re going into the office when you’re really just going into the spare bedroom? For some people, this may be what works, but not all of us. 
The key point is remote employees need to make a mental shift between home time and work time.Here’s some additional advice we’ve found helps:Get a workspace — Even if it’s just a portion of a room separated by a sheet. If you don’t have a separate workstation, it can feel like you’re constantly working and never able to recharge.Schedule breaks — Remote workers may find that their coworkers lack context about their day and schedule. For example, you may have a 30-minute meeting with one coworker who doesn’t realize you just finished a 2-hour sprint planning meeting. Many studies have shown scheduling breaks can help you be much more productive.Consider your air — In an office, someone else determines the temperature, humidity, light, etc., so it’s likely not top-of-mind. But in a home office, you’ll need to figure out the indoor climate that’s best for you. Are you closed into a little room, or does your HVAC system circulate air well? I found myself losing steam in the afternoons and after a great deal of investigation, I discovered that it’s because I was working in a room with the door closed all day — my workspace’s oxygen levels were measurably decreasing. I opened the door a crack and everything got better!Get a plant — In addition to filtering the air, plants are just — well, nice. It’s great to have something green and vibrant near you as you work. If a plant isn’t your idea of a good time, at least consider investing in an air monitor to make sure your air quality is as ideal as possible.Be consistent — Working from home has a tendency to let your work bleed into the rest of your life and time. Being predictable about your work schedule is one of the most important things you can do, especially if you live with others. Leaving the office is easier when you have the social cues of everyone else leaving. At home, it might sometimes feel like you’re just trading one screen for another. 
Consider setting an alarm and make a hard stop for your work efforts so you can be predictable to yourself, your family, and roommates.Make friendsIt’s difficult to form close friendships with people when you only talk about business. That’s why we’ve made an intentional point to schedule remote happy hours. Anything you can do to foster friendships will pay off; you’ll have happier employees, less miscommunication, better collaboration, and more inspiration. Investing in people always pays off.As one final note, most of this advice was formed before the pandemic, before the protests (see our blog post about George Floyd and racial injustice here), before the murder hornets, and before, well, 2020. Now, everyone is quarantined during a frankly frightening continual news cycle, trying to work. It’s a very different situation. People who’ve been working from home for years all agree this time period is very different even for them. Please make sure to give yourself and your coworkers understanding, empathy, and space as we all deal with these frankly hard-to-believe times together.By JT Olio on Remote WorkOriginally published at https://storj.io on July 15, 2020
20. 07. 15
Remote-first Versus Partial...
COVID-19 has forced nearly every company to adopt some semblance of a remote work culture. It’s great to see companies take the initiative in flattening the curve through social distancing. However, the remote work culture your organization adopted in response to COVID-19 is not necessarily a remote-first work culture.

At surface level, it may seem like an organization has a remote-first work culture. Nearly everyone in the company works remotely. All meetings are held using tools like Zoom and Google Hangouts. Maybe no employees have seen each other in person in months (and will likely not see one another for months to come). None of this necessarily makes your organization a remote-first culture.

Remote-first vs partially-remote work cultures

There’s a big difference between being a remote-first company and being a remote-supporting (or partially-remote) company. A remote-supporting company may allow employees to work remotely, but the culture will lead others to treat them as second-class citizens, even if the majority of the company is forced to work remotely in light of COVID-19. If a remote employee is invited to meetings, it may be only as an afterthought. Often, in remote-supporting companies, remote employees are less visible and may be the first to be let go during a downturn.

However, in a remote-first company, all solutions are designed to allow the best possible experience for remote employees. Remote workers aren’t afterthoughts, nor are they seen as secondary to in-person employees. They’re essential members of the team and are given every resource they need to thrive. The COVID-19 crisis forced many companies into remote-supporting status. With no end in sight for the pandemic, these organizations will need to begin thinking more like remote-first companies in the coming months.

Decide to be intentional about remote-first

There are many things to consider if you want your company to be remote-first. 
Do remote employees have the same opportunities to socialize and converse with coworkers? In other words, are you proactively fostering an environment of interoffice relationship building? In remote-first environments, managers and employees have to seek out what might otherwise be chance encounters.Does everyone have the same experience in a meeting, or are remote employees struggling to hear side conversations of the people in the meeting room? Do remote employees have to consistently advocate for themselves to be remembered or are there clear systems in place to make sure the right people are always invited to appropriate meetings, conversations, and decisions? Is it disruptive in any way if a “non-remote” person decides to become remote? If so, then you probably are still leaning too much on being in-person. The main benefit of remote-first is every employee not only gets to work from home, but can work from anywhere, and nothing about their role within the company, either implicit or explicit, really changes. Take meaningful steps to ensure all team members are included and communicating effectively, and you’ll be well on your way to creating a successful remote-first environment.If you’re coming from an office that isn’t remote-first and you want to adopt a remote-first environment, you must be intentional about everything you do and take active, measurable steps towards this goal. Being in the same room as your team brings a lot of side benefits unintentionally, such as social dynamics and cohesion, culture, inclusion, and advocacy (who is top of mind for the manager during promotion time?) etc. Remote-first work requires an intentional and deliberate strategy to ensure your team can benefit from all of these things while working towards your shared goals.Lay the foundation for effective communicationIn a remote-first working environment, it’s important to facilitate communication and collaboration between the company in a way that is natural and human. 
This should be a core emphasis of any remote-first organization’s culture.

Today, a lot of emphasis is placed on productivity tools and communication stacks. In reality, there’s little functional difference between video conferencing tools such as Zoom and Google Meet, or productivity tools such as Slack and Teams. Technology can help optimize and guide productivity, but at the end of the day, the tools don’t matter. It’s the people and their ability to comfortably and effectively communicate, as well as their outcomes, that should be measured.

To make a culture of effective communication possible, it’s important to proactively encourage interpersonal relationships amongst everyone and to promote inclusion around various work-style preferences.

Chance encounters and unplanned meetings lead to valuable outcomes in office-driven environments. The headquarters of Pixar and Apple were specifically designed to encourage chance run-ins. The digital analog for this would be creating #random Slack threads around shared interests, scheduling digital happy hours on Fridays, and encouraging digital workflows that help make people comfortable scheduling collaborative meetings on the fly.

It’s important to be intentional about creating remote-first spaces to encourage relationship building and digital “run-ins”. Having deliberate social time is important! These spaces will help foster a digital sense of community and inclusion among team members, allowing them to work more efficiently, collaborate more effectively, and have a better experience overall.

Creating spaces for both collaborative and asynchronous working styles

In remote-first business development roles, some employees may live by the calendar. 
An open calendar across the organization makes it easy to see whether people are busy with a prospect, meeting with others, or are able to join a call to help solve a problem or provide input.As a remote-first worker, an open, cross-organizational calendar makes it easy to know when one can pick up the phone and call a coworker for an urgent issue, or if one should hold off and communicate asynchronously via Slack or email.An open calendar also makes it easier for an employee to dedicate time for deep and focused sessions, preventing distractions and interruptions. Open-calendar culture allows anyone to block off time in order to focus on activities like integration testing, writing, or even personal wellness activities like meditating or working out.We hope these suggestions help you and your organization on its journey to being a remote-first workplace. Share your experience with remote work below.This blog is the first in a series on remote work. In this series, we share examples of how we address remote work at Storj Labs, where the majority of our team works remotely. To view the rest of the blogs in this series, visit: https://storj.io/blog/categories/remote-work.By JT Olio and Kevin Leffew on Remote WorkOriginally posted at https://storj.io on July 13, 2020
20. 07. 15
Changing the Security Parad...
By John Gleeson, VP of Operations at Storj Labs

With the paint barely dry on our production release of the Tardigrade Platform, one of the areas where we’re seeing the strongest interest from customers and partners building apps is our security model and access control layer. The security and privacy capabilities of the platform are some of its most differentiating features, and they give our partners and customers some exciting new tools.

Distributed and decentralized cloud storage is a fantastic way to take advantage of underutilized storage and bandwidth, but in order to provide highly available and durable cloud storage, we needed to build in some fairly sophisticated security and privacy controls. Because we had to build with the assumption that any Node could be run by an untrusted person, we had to implement a zero-knowledge security architecture. This turns out not only to make our system far more resistant to attacks than traditional architectures, but also to bring significant benefits to developers building apps on the platform.

Decentralized Architecture Requires Strong Privacy and Security

From the network perspective, we need to make sure the data stored on our platform remains private and secure. At the most basic level, we need to ensure that pieces of files stored on untrusted Nodes can’t be compromised, either by accessing that data or by preventing access to that data. We combine several different technologies to achieve data privacy, security, and availability.

From the client side, we use a combination of end-to-end encryption, erasure coding, and macaroon-based API keys. Erasure coding is primarily used to ensure data availability, although storing data across thousands of statistically uncorrelated Storage Nodes does add a layer of security by eliminating any centralized honeypot of data.

By way of example, when a file or segment is erasure coded, it is divided into 80 pieces, of which any 29 can be used to reconstitute the (encrypted) file. 
With our zero-knowledge architecture, any Node Operator only gets one of the 80 pieces. There is nothing in the anonymized metadata to indicate what segment that piece belongs to, or where the other 79 pieces are. It’s worth noting that 80 pieces is the minimum number of pieces for a single file. Files larger than 64 MB are broken up into 64 MB segments, each of which is further divided into 80 pieces. A 1 GB file, for example, is broken up into 16 segments, each with a different randomized encryption key, and each broken up into 80 pieces, for a total of 1,280 pieces.

If a hacker wants to obtain a complete file, they need to find at least 29 Nodes that hold a piece of that file and compromise the security of each one (with the Nodes being run by different people, using different firewalls, etc.). Even then, they would only have enough to reconstitute a file that is still encrypted. And they’ll have to repeat that process for the next file and, for files larger than 64 MB, for every segment of a file. Compare that to a situation (e.g. what was seen at Equifax a few years ago) where a simple misconfiguration gave access to hundreds of millions of individuals’ data, and you’ll see the power of this new model.

Just storing data on the Tardigrade platform provides significant improvements over centralized data storage in terms of reducing threat surfaces and exposure to a variety of common attack vectors. But when it comes to sharing access to data, especially highly sensitive data, developers really experience the advantages of our platform. We’re already seeing the most interest from partners in the combination of end-to-end encryption and the access management capabilities of our API keys.

Separating Access and Encryption

One of the great things about the Tardigrade Platform is that it separates the encryption function from the access management capabilities of the macaroon-based API keys, allowing both to be managed 100% client-side. 
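As an aside, the segment and piece counts quoted earlier in this section can be checked with a few lines of arithmetic. The sketch below only uses the figures given in this post (64 MB segments, 80 pieces per segment, any 29 needed); the function names are hypothetical, 1 GB is assumed to mean 1,024 MB, and nothing here is read from an actual Storj configuration.

```python
import math

# Figures quoted in this post; they are not read from any Storj configuration.
SEGMENT_SIZE_MB = 64   # maximum segment size
TOTAL_PIECES = 80      # pieces produced per erasure-coded segment
REQUIRED_PIECES = 29   # any 29 pieces rebuild the (still encrypted) segment


def segment_count(file_size_mb: int) -> int:
    """Number of segments a file of the given size is split into (at least one)."""
    return max(1, math.ceil(file_size_mb / SEGMENT_SIZE_MB))


def piece_count(file_size_mb: int) -> int:
    """Total number of pieces spread across Storage Nodes for one file."""
    return segment_count(file_size_mb) * TOTAL_PIECES


def tolerated_losses() -> int:
    """How many pieces of a segment can vanish before it becomes unrecoverable."""
    return TOTAL_PIECES - REQUIRED_PIECES


def expansion_factor() -> float:
    """Ratio of stored data to original data implied by a 29-of-80 scheme."""
    return TOTAL_PIECES / REQUIRED_PIECES
```

With these assumptions, `piece_count(1024)` reproduces the 1,280 pieces mentioned for a 1 GB file, and `tolerated_losses()` shows that 51 of the 80 pieces of a segment can be lost or offline before the segment can no longer be rebuilt.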
From a developer perspective, managing those two constructs is easy because all of the complexity is abstracted down to a few simple commands. What this enables developers to do is move access management from a centralized server to the edge.Hierarchically Deterministic End-to-End EncryptionAll data stored on the Tardigrade platform is end-to-end encrypted from the client side. What that means is users control the encryption keys and the result is an extremely private and secure data store. Both the objects and the associated metadata are encrypted using randomized, salted, path-based encryption keys. The randomized keys are then encrypted with the user’s encryption passphrase. Neither Storj Labs nor any Storage Nodes have access to those keys, the data, or the metadata.By using hierarchically derived encryption keys, it becomes easy to share the ability to decrypt a single object or set of objects without sharing the private encryption passphrase or having to re-encrypt objects. Unlike the HD API keys, where the hierarchy is derived from further restrictions of access, the path prefix structure of the object storage hierarchy is the foundation of the encryption structure.A unique encryption key can be derived client-side for each object whether it’s a path or file. That unique key is generated automatically when sharing objects, allowing users to share single objects or paths, with the ability to encrypt just the objects that are shared, without having to worry about separately managing encryption access to objects that aren’t being shared.Access Management with Macaroon-based API KeysIn addition to providing the tools to share the ability to decrypt objects, the Tardigrade Platform also provides sophisticated tools for managing access to objects. Tardigrade uses hierarchically derived API keys as an access management layer for objects. 
Similar to HD encryption keys, HD API keys are derived from a parent API key.Unlike the HD encryption keys where the hierarchy is derived from the path prefix structure of the object storage hierarchy, the hierarchy of API keys is derived from the structure and relationship of access restrictions. HD API keys embed the logic for the access it allows and can be restricted, simply by embedding the path restrictions and any additional restrictions within the string that represents the macaroon. Unlike a typical API key, a macaroon is not a random string of bytes, but rather an envelope with access logic encoded in it.Bringing it Together with the AccessAccess management on the Tardigrade Platform requires coordination of the two parallel constructs described above-encryption and authorization. Both of these constructs work together to provide an access management framework that is secure and private, as well as extremely flexible for application developers. Both encryption and delegation of authorization are managed client-side.While both of these constructs are managed client-side, it’s important to point out that only the API keys are sent to the Satellite. The Satellite interprets the restrictions set by the client in the form of caveats, then controls what operations are allowed based on those restrictions. Encryption keys are never sent to the Satellite.Sharing access to objects stored on the Tardigrade Platform requires sending encryption and authorization information about that object from one client to another. The information is sent in a construct called an Access. 
An Access is a security envelope that contains a restricted HD API key and an HD encryption key: everything an application needs to locate an object on the network, access that object, and decrypt it.
To make the implementation of these constructs as easy as possible for developers, the Tardigrade developer tools abstract the complexity of encoding objects for access management and encryption/decryption. A simple share command encapsulates both an encryption key and a macaroon into an Access in the format of an encoded string that can be easily imported into an Uplink client. Imported Accesses are managed client-side and may be leveraged in applications via the Uplink client library.
Why Security at the Edge Matters
The evolution of cloud services and the transition of many services from on-premise to centralized cloud has brought massive increases in efficiency and economies of scale. That efficiency is in many ways driven by a concentration not only of technology but of expertise, especially security expertise. That efficiency has also come at the cost of tradeoffs between security and privacy. Moreover, many new business models have emerged based almost entirely on the exchange of convenience for giving up the privacy of user data. In the cloud economy, users’ most private data is now more at risk than ever, and for the companies that store that data, new regulatory regimes have emerged, increasing the impact on those businesses if that data is compromised.
The Intersection of Cybersecurity Skill and Decentralized Data
While the transition from on-premise to cloud has brought a reduction in the number and types of hacks, much of the vulnerability of on-premise technology was due in part to a lack of cybersecurity experience and expertise. A big part of the push to Gmail is the fact that it’s much less likely to get hacked than a privately operated mail server.
The transition to the cloud has resulted in a much greater separation of security expertise and technology use.
The cost of best-in-class security expertise of cloud providers is, like the cost of infrastructure, spread across all customers. One additional consequence of that separation-the loss of cybersecurity expertise-is the lack of appreciation of the resulting tradeoff. That security does not come with transparency, and in fact, many times that security comes in exchange for a loss of privacy.This is where a decentralized edge-based security model provides a similar security advantage but without the tradeoffs against transparency or privacy. With Storj, you get the benefit of the team’s distributed storage, encryption, security and privacy expertise but you also get the full transparency of the open-source software. This ultimately enables the ability not only to trust but to verify the security of the platform, but that’s not where the difference ends. Storj provides all the security benefits of a cloud platform, but provides the tools to take back control over your privacy.Edge-based Security + Decentralized Architecture = Privacy by DefaultClassic authorization technologies are built for client-server architectures. Web-centric authorization schemes such as OAuth and JWT are built for largely synchronous transactions that involve separating the resource owner and the authorization service. Each of these approaches depends for its success on a central authority. To truly maximize privacy and security at massive scale, there is a need to efficiently delegate resource authorization away from centralized parties.Moving token generation and verification closer to the edge of the architecture represents a fundamental shift in the way technologists can create verified trust systems. Having the ability in a distributed system to centrally initiate trust (via API Keys) and extrapolate specifically scoped keys from that trust allows systems to generate their own trust chains that can be easily managed for specific roles and responsibilities. 
Authorization delegation is managed at the edge but derived based on a common, transparent trust framework. This means that access tokens generated at the edge can be efficiently interpreted centrally, but without access to the underlying encrypted data.Distributed and decentralized environments are designed to eliminate trust by definition. By moving security, privacy, and access management to the edge, users regain control over their data. With tools such as client-side encryption, cryptographic audits and completely open-source architecture, trust boundaries and risk are mitigated not by the service provider, but by the tools in the hands of the user.A Different Approach Delivers Differentiated Value Out-of-the-boxThe Tardigrade Platform’s distributed cloud storage and edge-based security model provide easy tools for building applications that are more private, more secure, and less susceptible to the range of common attacks. With this approach, no incompetent or malicious operator can undermine security. There is no careless administrator, no unethical data mining business model, no misconfigured print server, and no social hack that can undermine data. By embracing decentralization and security at the edge, the system is architected to be resilient. Unlike other cloud storage providers, like the AWS Detective solution, Tardigrade integrates security features which are enabled by default. With the Tardigrade Platform, you don’t pay extra for security and privacy.Reduced Risk — Common attacks (misconfigured access control lists, leaky buckets, insider threats, honeypots, man-in-the-middle attacks, etc.) depend for their success on breaching a central repository of access controls or gaining access to a treasure trove of data. 
The Tardigrade Platform security model provides a way to architect out whole categories of typical application attack vectors.Reduced Threat Surface — By separating trust boundaries and distributing access management and storage functions, a significant percentage of the typical application threat surfaces is either eliminated or made orders of magnitude more complex to attack.Enhanced Privacy — With access managed peer-to-peer, the platform provides the tools to separate responsibilities for creating bearer tokens for access management from encryption for use of the data. Separation of these concerns enables decoupling storage, access management and use of data, ensuring greater privacy with greater transparency.Purpose-Built for Distributed DataDistributed data storage architecture combined with edge-based encryption and access management stores your data as if it were encrypted sand stored on an encrypted beach. The combination of client-side HD Encryption keys and HD API keys in an easy-to-use platform enables application developers to leverage the capability-based security model to build applications that provide superior privacy and security.Originally published at https://storj.io.
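The segment and piece arithmetic described earlier in this post (64 MB segments, 80 pieces per segment, any 29 pieces sufficient to rebuild a segment) can be sketched in a few lines of Python. This is only an illustration of the numbers quoted in the post, not part of any Tardigrade tool; the function name and the assumption that 1 GB equals 1,024 MB are ours.

```python
import math

SEGMENT_SIZE_MB = 64      # segment size stated in the post
PIECES_PER_SEGMENT = 80   # pieces produced per segment
PIECES_TO_REBUILD = 29    # any 29 pieces reconstitute a segment


def piece_counts(file_size_mb: float) -> dict:
    """Return segment and piece counts for a file of the given size (hypothetical helper)."""
    # Even a tiny file occupies one segment, so the minimum is 80 pieces.
    segments = max(1, math.ceil(file_size_mb / SEGMENT_SIZE_MB))
    return {
        "segments": segments,
        "total_pieces": segments * PIECES_PER_SEGMENT,
        # Fewest pieces an attacker must collect to rebuild the
        # (still encrypted) file: 29 for every segment.
        "pieces_attacker_needs": segments * PIECES_TO_REBUILD,
    }


one_gb = piece_counts(1024)  # assumes 1 GB = 1,024 MB
```

Under those assumptions, a 1 GB file yields 16 segments and 1,280 pieces, matching the figures in the post, and an attacker would need pieces from 29 Nodes for each of the 16 segments.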
20. 05. 20
General Availability for Ta...
The internet was designed to be decentralized. When you send a message, stream media, or do a video conference, you don’t worry about which routers your data is passing through, who owns them, or whether some may be down. The decentralized model for internet communications has delivered multiple orders of magnitude of improvements in reliability, speed, and price. No one questions the appropriateness of leveraging TCP/IP for enterprise applications.
However, leveraging decentralization for enterprise-grade compute and storage has never been possible, at least until today. Being first is never easy, but it’s always notable. Today, we’re pleased to celebrate the launch of the world’s first SLA-backed decentralized cloud storage service.
Two years ago, we started rebuilding our decentralized cloud storage network from scratch. Our previous network reached more than 150 petabytes of data stored across more than 100,000 Nodes. However, our offering struggled to scale beyond those numbers, and we didn’t believe it could meet enterprise-grade requirements. It was a tough decision to make, but we can proudly say our rebuild was well worth the effort.
Through an extensive beta period, with thousands of users and Storage Node Operators, we’ve been able to demonstrate enterprise-grade performance and availability; we’ve delivered S3 compatibility; and we’ve demonstrated 100% file durability for over 10 months, enhanced by the fact that we have cross-geography redundancy by default. Our economic model allows us to offer prices at a fraction of the large providers, while still delivering attractive compensation rates to our Storage Node Operators.
We’re also able to offer great channel margins to our partners, which builds a solid foundation for a healthy business at Storj Labs.
Perhaps most importantly, our end-to-end encryption and zero-knowledge architecture enabled us to deliver a security model that’s significantly more resilient and prevents anyone (including us) from mining data or compromising user privacy.
Launch Gates to Measure Success
As we’ve talked about in the past, we put in place a rigorous set of launch gates, and we weren’t willing to proceed with a production launch until:
We had demonstrated multiple months of consistent progress in durability, availability, and usability.
We had sufficient capacity and a large enough number of vetted Node Operators to ensure we could meet demand.
We had stress tested the system with thousands of users and Node Operators.
We had the tools, documentation, and libraries available to support our major use cases.
We had battle tested the system with partners.
We had been thoroughly vetted by third-party security firms.
We could confidently back our service with enterprise-grade Service Level Agreements (SLAs).
Tardigrade Connectors Ready for Use
We’re proud to be launching Tardigrade along with connectors that allow users to integrate our service with some of the most popular and innovative open source and cloud applications in the world. In addition to our thousands of users, Kafkaesque, Fluree, Verif-y, and CNCTED are just a few of the partners with Tardigrade integrations. We have many more partners finalizing integrations each week. And we’re launching with not only built-in S3 compatibility but also an extensive library of bindings, including .Net, C, Go, Python, Android, Swift, and Node.js, that enable developers to build Storj-native applications and take advantage of our full range of advanced features, such as macaroons.
Perhaps my favorite part about Tardigrade is that the platform is supporting the open source community.
I’ve spent the past 15 years working for open source projects and have seen first-hand many of the challenges they face. If you’re an open source project whose users store data in the cloud, you can passively generate revenue every time your users upload data to Tardigrade. Through our Open Source Partner Program, any open source project with a connector will receive a portion of the revenue we earn when its users store data on Tardigrade. There are no limits. There is no catch. We want to support open source because we ourselves are open source software developers.
To our amazing community: thank you immensely for supporting us throughout this rebuild journey. We’re proud to say we met most of our deadlines, but we thank you for your patience and hope you love Tardigrade as much as the team here at Storj Labs does!
Storj Labs is among a myriad of companies building the infrastructure to power decentralized web 3.0.
If you haven’t given Tardigrade a try, take a few minutes, sign up for an account to receive a free credit, and upload a file. See the power of decentralization for yourself.
By Ben Golub on Business
Originally posted at https://storj.io on March 19, 2020
20. 03. 19
What to Expect in Production
20. 03. 01
Announcing Early Access For...
Today our team is thrilled to announce our Tardigrade decentralized cloud storage service is finally ready for production workloads, and we can’t wait for you to try it out. We’re welcoming the first paying customers to experience the advantages of decentralized cloud storage with an early access release (RC 1.0). We expect a complete production launch this quarter, at which time we’ll remove the waitlist and open user registration to all.
With this release, Tardigrade users can expect:
Full service level agreements: For both our early access release and our production launch, Tardigrade users can expect 3 9s of availability (99.9%) and 9 9s of durability (99.9999999%).
1TB credits for storage and bandwidth: All users who signed up for our waitlist will receive this credit after adding a STORJ token balance or credit card to their account. All waitlist credits will expire one full billing cycle after our production launch, so claim them now so you don’t miss out! After users utilize their credits, they’ll be charged for their usage.
1TB limits on storage and bandwidth: During this early access period, all accounts will have limits of 1TB for both their static storage usage and their monthly bandwidth. Submit a request through our support portal if you need to increase this limit.
Backward compatibility: Developers building on top of Tardigrade can expect their applications to have full backward compatibility with the general availability production launch.
If you haven’t yet, sign up now to get your credit before time runs out! If you’ve already joined the waitlist, check your inbox for an email with details on how to get started. If you’re already a Tardigrade user, first, thank you very much, and second, your account will be credited with 1TB of storage and bandwidth after you add a STORJ token balance or a credit card.
Users who have both a credit card and a STORJ token balance will first see their STORJ token balance charged, with their credit card as the secondary method of payment. Even after users exhaust their credits, they’ll still pay less than half the price of traditional cloud storage for their usage.
Over the past six months, our team has quietly been gathering feedback from customer pilots and POCs, as well as data from network stress tests. We’re confident our initial partners and customers, as well as users who are joining from the waitlist, will have a positive experience trying out Tardigrade. As an extra measure to ensure the quality of that experience, we’re being extremely vigilant in balancing the network in terms of supply and demand. Over the coming weeks, we’ll continue to gather data and feedback from our initial customers. Once we’re fully confident we can scale that quality experience to meet the anticipated demand, we’ll announce general availability, remove the waitlist, and allow anyone to sign up to experience Tardigrade first-hand.
General Availability Release Timing
Between now and production not much will change in the system, other than a significant increase in data to be uploaded by our first paying customers.
We’ve previously talked about the qualification gates we use to ensure the product is ready to support our customers’ cloud storage needs and deliver a solid user experience, both of which are critical to drive adoption of the platform. We established these gates to guarantee that we delivered not only an enterprise-grade technology solution but also a world-class customer experience. Since establishing these launch gates, we’ve continuously monitored and reported upon our progress. At every major milestone, we’ve evaluated our progress toward the gates, and the applicability and validity of those gates.
As part of this new early access phase, we’ve added an additional qualification gate, which is to deliver two weeks of success with real-world users (and their data).
Tardigrade Adoption During Beta 1 and 2
To date, we’ve welcomed 10,000+ Tardigrade waitlist members to the platform. There is more than 4 PB (4,000 TB) of data stored with 20 PB of available capacity. Thousands of developers have created accounts and projects and uploaded data to the network. In addition to these developers, we’ve been working with a number of large, notable partners and customers who have large-scale use cases to prove that decentralized cloud storage is the right solution for their needs. We’ll be sharing more about these customers’ use cases in the coming months.
During this early access phase, we’ll provide an extra level of support to early customers and developers, gather further feedback, and continue to refine the onboarding process. We’re doing this so that when we actually move forward with mass adoption, we’ll be ready, and so will Tardigrade.
Early Access vs General Availability
Once we announce general availability, users will be able to sign up directly on Tardigrade.io, after which they’ll receive immediate confirmation for their account. During early access, we’ll be sending out invites once a week to continue to throttle new user registrations.
As mentioned before, those users that do have access will have their limits raised to 1TB for both bandwidth and storage after they add a method of payment. During general availability, limits will start at 5GB once a credit card is added, and limits will increase from there. So sign up now to reserve your higher limit for our production launch.
Our Storage Node Operators won’t experience much of a difference between early access and general availability other than a steady increase in the amount of customer data and bandwidth utilization on the network.
We have a lot of upcoming features planned for the storage nodes including SNO board enhancements, configuration tools and improvements to graceful exit . We have some exciting news to share about our team building out features for Storage Node Operators on the Q1 2020 Town Hall, so make sure to tune in.After our general availability release, we’ll share more information about our 2020 roadmap, but we’ve been very impressed by the amount of feedback we’ve received through the ideas portal and on the forum — we’ve actually incorporated many of the suggestions into our plans. If you have additional suggestions, please share them. We review every single suggestion. You can also see other ideas that have been submitted and the status of suggestions.We want to give a HUGE thank you to our community of amazing Tardigrade users and Storage Node Operators for all your contributions to the network. We literally couldn’t build the decentralized future without you and your efforts.By John Gleeson and JT Olio on BusinessOriginally published at https://storj.io on January 30, 2020
20. 03. 01
Use Cases for the Decentral...
Have you heard the news? Tardigrade is in early access production, which means that developers can start using the decentralized cloud storage service for real workloads. This is a huge milestone for the company and network, and we’re excited to have you along for the journey.
Tardigrade Benefits
Tardigrade is superior to centralized alternatives for a number of reasons. First off, we’re more durable. Decentralization means there is no single point of failure, and the risk of file loss is dispersed through statistical uncorrelation.
Data is streamed hyper-locally and in parallel, enabling us to be much faster than centralized competitors. Because our economics are similar to those of Airbnb (or Uber), we’re also able to sell storage at half the price of AWS.
While decentralized cloud storage is awesome and highly optimized for some use cases, it isn’t a perfect fit for everything. Tardigrade object storage is highly optimized for larger files, especially those which are written once and read many times.
Use Cases for Tardigrade
You may be wondering how you can get started. Here are a few specific use cases that are well suited for decentralized cloud storage:
Large File Transfer: Tardigrade is especially well suited for transferring large amounts of data point-to-point over the internet. High-throughput bandwidth takes advantage of parallelism for rapid transit, and client-side encryption ensures privacy during transit.
Common examples include large files, academic/scientific datasets, binaries, media collections, and the like. Developers aren’t charged for upload bandwidth, so uploading files to the network doesn’t incur any cost, and there are no penalties if you decide to take your data with you and run.
For a great example of large file transfer, see transfer.sh.
In an average month, transfer.sh is used more than 1,000,000 times to easily transfer files worldwide.Database backups: Storing backups and snapshots of databases is another use case that is specifically well suited for decentralized storage. Regular snapshot backups of databases for data recovery or testing are an entrenched part of infrastructure management. They enable you to quickly capture the state of your database at a given point in time and capture the change in the database from the backup to the present.On the decentralized cloud, streaming backups eliminates the need to write large database snapshots to local disk before backup or for recovery.Low volume CDN: Out of the box, Tardigrade supports the fluid delivery of multimedia files with the ability to seek to specific file ranges and support for large numbers of concurrent downloads.On the decentralized cloud, native file streaming support and distributed bandwidth load across highly distributed nodes reduces bottlenecks.Multimedia Storage: Another common use case is for the storage of large numbers of big multimedia files, especially data produced at the edge from sources like security cameras that must be stored for long periods of time with low access.Rapid transit leveraging parallelism makes distributed storage effective for integrating with video compression systems to reduce the volume of data stored.Private Data: Tardigrade is highly optimized for data that is highly sensitive and an attractive target for ransomware attacks or other attempts to compromise or censor data.Client-side encryption, industry-leading access management controls, and a highly distributed network of storage nodes reduce attack surface and risk.Back-end to dApps: A dApp backed by centralized cloud storage means you’re missing out on the biggest benefits of decentralization. 
Using Tardigrade as the back-end to your dApp increases its privacy, security, and resiliency when compared to legacy, centralized cloud storage solutions.Get StartedWe expect there are many more ways to incorporate Tardigrade decentralized cloud storage into your applications and cloud environments. Ready to get started and see what Tardigrade can do for you? Follow our Tardigrade documentation to create your first project and upload your first file in just a few minutes.By Kevin Leffew on TutorialsOriginally published on https://storj.io on February 11, 2020
20. 03. 01
How deletes affect performa...
Our team here at Storj Labs is currently in the middle of adding support for CockroachDB, which is a horizontally scalable, Postgres-compatible database. Each database technology behaves differently, and it’s important to understand the tradeoffs each one makes in order to use it efficiently. Along the way, we are learning a lot about CockroachDB, and one of the things we have learned is how deletes impact database performance.
Performance
CockroachDB uses a technique called MVCC ¹ to manage concurrent transactions. Deletes leave behind tombstones to mark records as deleted. Deleted records don’t get removed from the primary key/value store or indices until the gc_ttl grace period (the gc.ttlseconds zone configuration variable) expires, which has a default of 25 hours. This means any query using the table has to process more data than you may expect if you assumed all those records were immediately removed from the system. I want to stress that this doesn’t violate any transactional guarantees and doesn’t return any undesired results. Deleted data appears deleted correctly. This only affects performance. If you’re doing sparse deletes, this probably won’t be noticeable. If you’re doing bulk deletes, you may notice performance doesn’t improve after you have issued deletes until the 25-hour window has expired and the bulk-deleted records have been purged. Old values changed with updates also get purged when the gc_ttl grace period has expired.
Another thing to consider with CockroachDB deletes is that if you issue too large a delete statement, you may experience a “query too large” exception. To work around this, you can delete records with a limit or over some continuous range of the primary key.
Some techniques to consider to mitigate these side effects, if you’re experiencing this problem, include lowering the gc_ttl 25-hour interval. If you’re using the enterprise version of CockroachDB you can use partitions, or alternatively views if you’re not using the enterprise version. Truncating entire sections of a table is also an option.
This avoids tombstones, but requires you to define sufficiently coarse-grained segments when you do the inserts.Thanks, Namibj from the Cockroach Slack for this information.You can read more details on the official CockroachDB documentation ².“CockroachDB relies on multi-version concurrency control (MVCC) to process concurrent requests while guaranteeing strong consistency. As such, when you delete a row, it is not immediately removed from disk. The MVCC values for the row will remain until the garbage collection period defined by the gc.ttlseconds variable in the applicable zone configuration has passed. By default, this period is 25-hours.This means that with the default settings, each iteration of your DELETE statement must scan over all of the rows previously marked for deletion within the last 25-hours. This means that if you try to delete 10,000 rows 10 times within the same 25-hour period, the 10th command will have to scan over the 90,000 rows previously marked for deletion.If you need to iteratively delete rows in constant time, you can alter your zone configuration and change gc.ttlseconds to a low value like five minutes (i.e., 300), and run your DELETE statement once per GC interval. We strongly recommend returning gc.ttlseconds to the default value after your large deletion is completed.”Why gc_ttl existsThis 25-hour window exists to help support long-running queries and the AS OF SYSTEM TIME³ clause that enables querying a specified time in the past. Another purpose is for restore. Data is kept around for a while so that you can restore back to a point in time. For example, if you ended up deleting more data than you intended with a bad where clause, restore can put the table back to where it was before.BackupsIt’s important that you do an incremental backup at least once within the gc_ttl time window.References¹ Multiversion concurrency control. 
https://en.wikipedia.org/wiki/Multiversion_concurrency_control
² Why are my deletes getting slower over time. https://www.cockroachlabs.com/docs/stable/sql-faqs.html#why-are-my-deletes-getting-slower-over-time
³ AS OF SYSTEM TIME. https://www.cockroachlabs.com/docs/stable/as-of-system-time.html
By Simon Guindon
Originally published at https://storj.io.
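The workaround mentioned in this post, deleting over a continuous range of the primary key instead of issuing one huge statement, might look roughly like the sketch below. It uses Python's built-in sqlite3 module purely as a runnable stand-in; against CockroachDB you would use a Postgres-compatible driver instead. The table name, column name, and helper function are hypothetical, and batching only keeps each statement small: it does not remove the tombstones any sooner.

```python
import sqlite3


def delete_in_batches(conn, table, key_col, lo, hi, batch_size):
    """Delete rows with lo <= key < hi, one key range at a time.

    Hypothetical helper. `table` and `key_col` are interpolated into the
    SQL text, so they must be trusted identifiers, never user input.
    Returns the total number of rows deleted.
    """
    deleted = 0
    start = lo
    while start < hi:
        end = min(start + batch_size, hi)
        cur = conn.execute(
            f"DELETE FROM {table} WHERE {key_col} >= ? AND {key_col} < ?",
            (start, end),
        )
        conn.commit()  # one small transaction per key range
        deleted += cur.rowcount
        start = end
    return deleted


# Demo against an in-memory SQLite database (stand-in only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [(i, "x") for i in range(1000)])
conn.commit()

removed = delete_in_batches(conn, "events", "id", 0, 750, batch_size=100)
remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Each iteration touches a bounded slice of the key space, so no single statement grows with the total number of rows being removed.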
19. 12. 20
Secure access control in th...
When the tech industry began the transition to cloud-based resource provisioning, the attack vectors that DevOps teams and CISOs focus on to protect their resources shifted along with it.
Suddenly, protecting users’ data required a fundamentally new approach to containing resources. Rather than simply “defending the perimeter” (through ownership of network infrastructure, firewalls, NICs etc.), the model shifted to an identity-based approach to control access to systems and resources.
This practice has become known as Identity and Access Management (IAM), and defines the way users authenticate, access data, and authorize operations in a public cloud environment.
When it comes to authorization and authentication on the web, the standard public cloud approach is through Access Control Lists (ACLs). However, the capability-based approach leveraged by decentralized networks is indisputably more secure, and I will explain why in this blog post.
Core Problems with Public Cloud’s ACL model
The ACL model, sometimes referred to as the Ambient Authority Model, is based on user identity privileges (for example, through Role-based Access Control).
The ACL keeps a list of which users are allowed to execute which commands on an object or file.
This list of abilities is kept logically separate from the actual identity of the users.
The appeal of ACLs partially arises from the notion of a singular “SuperAdmin” being able to list and fully control every user’s account and privileges.
This centralized approach to control creates a massive honeypot for hackers, because when the SuperAdmin loses control, the entire system falls apart.
Because the ACL model defines access through the user-agent identity (or abstractions like roles, groups, service accounts etc.), each resource acquires its access control settings as the result of a superuser administrator making deliberate access configuration choices for it.
This is a major weakness of the ACL approach, especially within today’s massively parallel and distributed systems, where resources are accessed across disparate operating systems and multiple data stores.
Essentially, the ACL model associates users with files and controls permissions around them.
The Access Control List approach fails for two reasons:
Failure 1: ambient authority trap
An authority is “ambient” if it exists in a broadly visible environment where any subject can request it by name.
For example, in Amazon S3, when a request is received against a resource, Amazon has to check the corresponding ACL (an ambient authority) to verify that the requester has the necessary access permissions.
This is an unnecessary extra hop in the authentication process that leads to ambient authority.
In this scenario the designation of the authority (the user) is separated from the authority itself (the access control list), violating the Principle of Least Authority (POLA).
Furthermore, IAM systems based on the ACL model fall into the ambient authority trap, where user roles are granted an array of permissions in such a way that the user does not explicitly know which permissions are being exercised.
In this design flaw, inherent to many public cloud platforms, user-agents are unable to independently determine the source, number, or types of permissions that they have, because the list is held separately from them on the ACL. Their only option is trial and error: making a series of de-escalated privilege calls until one succeeds.
To invoke an analogy, this is like using a personal, unmarked key on an endless series of doors. You don’t know which door will open until you try it. Very inefficient!
As a result, if agents cannot identify their own privilege set, they cannot safely delegate restricted authority on another party’s behalf. It would be risky to lend a key to a neighbor without knowing which of your doors it might open.
In the world of operating systems and mission-critical distributed systems, avoiding ambient authority privilege escalation is crucial, especially when running untrusted code.
Most applications today are launched with grossly excessive authority over the user’s operating system. This is why many systems sandbox software with mechanisms such as FreeBSD jails, Capsicum, and Docker containers on Linux.
Google is even working on a new capability-based operating system called Fuchsia to supersede the Linux kernel underlying Android.
Failure 2: confused deputy problem
A deputy is a program that manages authorities coming from multiple sources. A confused deputy is a delegate that has been manipulated into wielding its authority inappropriately.
Examples of the Confused Deputy Problem can be found across the web.
These include injection attacks, cross-site request forgery, cross-site scripting attacks, click-jacking, etc. These attacks take advantage of ambient authority to turn the victim’s existing program logic to nefarious ends in web applications.
In order to avoid the Confused Deputy Problem, a subject must be careful to maintain the association between each authority and its intended purpose. The capability-based model described below avoids the problem entirely.
Capability-based security is better
From a security-design standpoint, the capability model introduces a fundamentally better approach to identity and access management than Public Cloud’s ACL framework.
By tying access to keys, rather than a centralized control system, capability-based models push security to the edge, decentralizing the large ACL attack vector and creating a more secure IAM system.
The capability-based model solves both the ambient authority trap and the confused deputy problem by design.
What is a capability?
Often referred to as simply a ‘key,’ a capability is the single thing that both designates a resource and authorizes some kind of access to it. The capability is an unforgeable token of authority.
Those coming from the Blockchain world will be very familiar with the capability-based security model, as it is the model implemented in Bitcoin where “your key is your money” and in Ethereum where “your key is gas for EVM computations”.
This gives the client-user full insight into their privilege set, illustrating the core tenet of the Capability Mindset: “don’t separate designation from authority.”
Similar to how in the Blockchain world “your keys are your money,” with Tardigrade your keys are your data, and macaroons add capabilities that allow the owners of data to caveat it, or granularly delegate access for sharing, programmatically.
Key-based ownership of object data will enable users to intuitively control their data as a first principle, and then delegate it as they see fit.
The decentralized cloud eliminates the increasingly apparent risk of data loss or extortion that comes from holding data with one single provider (like Amazon, Google, or Microsoft).
Storj, with its Tardigrade service, presents a better model in which object data is encrypted, erasure-coded, and spread across thousands of nodes stratified by reputation, so that any and every computer can be the cloud.
Macaroons are the key innovation
Macaroons enable granular, programmatic authorization for resources in a decentralized way.
The construction of macaroons was first formulated by a group of Google engineers in 2014. These chained, nested constructions are a great example of the capability-based security model and are deeply integrated into the V3 Storj Network.
Macaroons are excellent for use in distributed systems, because they allow applications to enforce complex authorization constraints without requiring server-side modification, making it easy to coordinate between decentralized resource servers and the applications that use them.
Their name, “MAC-aroons”, derives from the HMAC process (hash-based message authentication code) by which they are constructed, while also implicitly alluding to a claim of superiority over the HTTP cookie.
In practice, HMACs are used to simultaneously verify both the integrity and the authenticity of a message.
Similar to the blocks in a blockchain, HMACs are chained within a macaroon (each caveat’s hash covers the caveats before it), such that caveats that restrict capabilities can only be appended, not removed.
Macaroons address the cookie-theft problem associated with OAuth 2.0 and traditional cloud services by delegating access to a bearer token that can only be used in specific circumstances, enforced through HMAC-chained ‘caveats’ (e.g., restrictions on IP, time-server parameters, and third-party auth discharges).
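To make the chaining concrete, here is a minimal Python sketch of the idea. It is not Storj’s implementation: the names `Macaroon`, `mint`, `attenuate`, and `verify`, the caveat strings, and the choice of HMAC-SHA256 are all assumptions made for illustration.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Macaroon:
    """Hypothetical token: an identifier, a list of caveats, and a chained signature."""
    identifier: str
    caveats: Tuple[str, ...]
    signature: bytes


def _mac(key: bytes, message: str) -> bytes:
    # HMAC-SHA256 is an assumption; any keyed MAC would illustrate the same chain.
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def mint(root_key: bytes, identifier: str) -> Macaroon:
    """Issuer side: the first signature is keyed by the secret root key."""
    return Macaroon(identifier, (), _mac(root_key, identifier))


def attenuate(m: Macaroon, caveat: str) -> Macaroon:
    """Holder side: append a caveat, keying the new MAC with the old signature."""
    return Macaroon(m.identifier, m.caveats + (caveat,), _mac(m.signature, caveat))


def verify(root_key: bytes, m: Macaroon, satisfied: Callable[[str], bool]) -> bool:
    """Issuer side: replay the chain from the root key and check every caveat."""
    signature = _mac(root_key, m.identifier)
    for caveat in m.caveats:
        if not satisfied(caveat):
            return False
        signature = _mac(signature, caveat)
    return hmac.compare_digest(signature, m.signature)


if __name__ == "__main__":
    root = b"hypothetical-root-secret"
    read_only = attenuate(mint(root, "api-key-1"), "op = read")
    print(verify(root, read_only, lambda c: c == "op = read"))  # True

    # Dropping the caveat while keeping the signature breaks the chain.
    stripped = Macaroon(read_only.identifier, (), read_only.signature)
    print(verify(root, stripped, lambda c: True))  # False
```

Because each signature is keyed by the one before it, a holder can add restrictions without knowing the root key, but has no way to recompute the signature for a shorter caveat list.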
These caveats can be extended and chained, but not overwritten.
Capability-security in the Tardigrade Network
In the Tardigrade Network, macaroons are referred to as API Keys, and enable users to granularly restrict and delegate access to object data in a way that is decentralized and more secure than existing cloud solutions.
From a developer standpoint, capabilities make it very easy to write code that granularly defines security privileges. Once baked, the rules within the capability cannot be changed without reissuing the key itself.
Access management on the Tardigrade platform requires coordination of two parallel constructs: Authorization and Encryption. With macaroons, both of these constructs work together to provide an access management framework that is secure and private, as well as extremely flexible for application developers.
A macaroon embeds the logic for the access it allows, and can be restricted simply by embedding the path restrictions and any additional restrictions within the string that represents the macaroon.
Unlike a typical API key, a macaroon is not a random string of bytes, but rather an envelope with access logic encoded in it.
To make the implementation of these constructs as easy as possible for developers, the Tardigrade developer tools abstract the complexity of encoding objects for access management and encryption/decryption (https://godoc.org/storj.io/storj/lib/uplink#hdr-API_Keys).
Macaroons in action
While the possibilities for access controls that can be encoded in a caveat are virtually unlimited, the specific caveats supported on the Tardigrade Platform are as follows:
Specific operations: Caveats can restrict whether an API Key can perform any of the following operations: Read, Write, Delete, List
Bucket: Caveats can restrict whether an API Key can perform operations on one or more Buckets
Path and path prefix: Caveats can restrict whether an API Key can perform operations on Objects within a specific path in the object hierarchy
Time window: Caveats can restrict when an API Key can perform operations on objects stored on the platform
For some sample Go code around access restriction, check out https://godoc.org/storj.io/storj/lib/uplink#example-package--RestrictAccess
Conclusion
Macaroons are a great example of capability-based security models in action, and Storj is a shining example of their implementation in decentralized cloud protocols.
In Storj, we refer to our implementation of macaroons (built on HMACs) as simply API Keys.
Using macaroons as a construct for API keys is innovative and useful because of their:
Speed: HMACs are very fast and lightweight
Timeliness: Can require fresh credentials and revocation checks on every request
Flexibility: Contextual confinements, attenuation, delegation, and third-party caveats
Adoptability: HMACs can run everywhere
One of the best ways to learn about capability-based models is to try them in action.
Sign up for the developer waitlist, join our community forum, and let us know what you think!
By Kevin Leffew
Thanks to Noam Hardy and JT Olio.
Sources
http://srl.cs.jhu.edu/pubs/SRL2003-02.pdf
http://zesty.ca/zest/out/msg00139.html
http://cap-lore.com/CapTheory/ConfusedDeputy.html
https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41892.pdf
Originally published at https://storj.io.
19. 12. 05
How to Measure Whether Your...
By Ben Golub, Storj Labs
So, you’re about to launch a product that will be used for business-critical use cases. Everyone speaks about being “production-ready” and “enterprise-grade.” How can you make those vague concepts concrete?
There is an old saying that “You never get a second chance to make a first impression.” In the enterprise storage world, the corollary is: “You never get a second chance to be trusted with someone’s data.” With Tardigrade, we are asking users to consider storing production data on a decentralized service. So, what does it mean to be “production grade,” and how can you measure if you’ve achieved it?
Measuring whether your new service is ready for the world to start using is not as easy as it sounds. You have to answer a number of questions, including but not limited to:
Who is the intended user and what are the intended use cases?
Is the product easy for your intended user to incorporate into their environment?
Is the product comparable to or better than existing alternatives?
Does the product deliver rigorously measurable performance and reliability?
What may be production-grade for one person might not be production-grade for another. But what’s clear is that the concept of being production grade goes far beyond delivering a specific amount of code or features.
When we set out to build a new type of cloud storage, we knew it would be a challenge. We needed to ensure the highest levels of performance, reliability, and resiliency straight from the beginning. People aren’t forgiving when it comes to data loss. (Surprise!) To measure whether our decentralized cloud storage solution meets the needs of our customers and has performance and durability we can stand strongly behind, we created a series of metrics that are important to our platform, our company, and our community of developers.
We call these metrics “launch gates,” and for us to enter a new milestone, from Alpha to production, each gate must be achieved.
Today, we announced our Beta 2 (Pioneer 2) release, which is our final milestone before our production launch in January 2020.
Here are the details on where we currently are, how that compares to our Beta 2 launch gates, and what we need to do to reach production.
Availability
Our availability measure is a full-stack, end-to-end measurement of all of our systems being responsive and performing requested operations. We do this by uploading and downloading 10 MB files. We test a randomly selected segment for each Satellite every minute. We already had good availability for the last few months, but after some exciting changes to our protocol, architecture, and operational processes 3 weeks ago, we have seen a success rate of 100%. If a file fails an availability check, it doesn’t mean the file is gone (we’ve never lost a file); it just means a second attempt is necessary to download it. Our production goal is 99.995% availability (i.e., the service should only be unavailable for two minutes in any given month, or about four seconds per day), so once we have sufficient statistical history on our new architecture and process, we feel confident we can achieve this.
File Durability
File durability measures the likelihood that one of your files could go missing. This is especially challenging to calculate when you’ve never lost a file. We are very proud of the fact that, since our cutover to version 0.1.0 seven months ago, we haven’t lost a single file. Some might think we could claim 100% durability. However, our meticulous data scientists remind us that, statistically speaking, we need to have had billions of file-months of 100% durability in order to state, with 95% confidence, that we’ll have 99.99999999% durability. So, applying our current level of 2 million files at 100% durability for seven months, the statistical model yields 99.9999% (6 9s) durability (i.e., the statistical likelihood of losing a file is 1 in a million, far less than the chances of being struck by lightning this year). For production, we are aiming to get to 99.9999999% (9 9s) durability, and our long-term goal is to reach 11 9s of durability. To officially make that guarantee, we will need to maintain the current level of 100% durability for significantly more files for at least one year.
Now, you may be wondering how we achieve this level of durability. Each file on the network is encrypted (using keys held only by the user uploading the data) and then divided into 64 MB chunks called segments. Each segment is then encoded into 80 pieces using Reed-Solomon error-correcting code, which enables us to rebuild the segment from any 29 of its 80 pieces. Each of those 80 pieces is stored on a different Node, with different power supplies, geographical locations, operators, and operating systems. The system as a whole continually monitors those Nodes for uptime and issues cryptographic audits to ensure that the Nodes are storing what they claim to be storing. The system knows, for every single segment, how many of those 80 pieces are currently on Nodes that are online and have passed all required audits. Whenever the number of healthy pieces for a segment drops to or below the repair threshold (currently 52, based on statistical models), the system rebuilds the missing pieces and sends them to new Nodes. As you can see from our segment health distribution chart below, we’ve never had a segment drop below 50 of its 80 pieces. To lose a segment, we’d have to drop below 29. As we add more Nodes to the network, this should continue to improve.
Upload Performance
To measure upload performance, we calculate the time it takes to upload a 10 MB file, and we repeat this test 250 times. We then compare the results to AWS, which is generally considered the gold standard for centralized cloud service providers.
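As a rough illustration of how timings from repeated runs like these can be summarized, the sketch below computes a nearest-rank percentile and the gap between the median and a tail percentile. The function names and the nearest-rank method are assumptions for illustration, and this is not necessarily the exact method behind the numbers in this post.

```python
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile for an integer pct in 1..100 (assumed method)."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, which avoids floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def tail_spread(samples: Sequence[float], tail_pct: int = 99) -> float:
    """How much slower the tail percentile is than the median."""
    return percentile(samples, tail_pct) - percentile(samples, 50)


if __name__ == "__main__":
    # Hypothetical timings in seconds; real runs would record 250 of these.
    timings = [1.9, 2.0, 2.1, 2.0, 2.4, 2.2, 2.1, 2.0]
    print(percentile(timings, 50), tail_spread(timings, 99))
```

With 250 samples, the nearest-rank 99th percentile is the 248th fastest run, so a small spread means that even the slowest few transfers finished close to the typical time.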
We not only look at the median time to upload, we also look at the long-tail performance. Across a broad range of file sizes and locations, we are comparable to AWS. For a Beta, that’s pretty encouraging. Moreover, we have a really tight distribution. Our 99th percentile upload time is only 0.37 seconds slower than our 50th percentile time (i.e. the slowest 5 files uploaded almost as fast as the 50 fastest), versus a 5.35 second differential between the 50th and 99th percentile on AWS. This consistency and predictability is due to the inherent benefits of decentralization and should only get better as we add more Nodes and continue to add Nodes that are distributed closer to the end-users (ultimately, the speed of light becomes a factor).
The above results are for uploads from a location in Toronto (i.e., eastern North America) in conjunction with a Satellite in Iowa (i.e., Central US).
Download Performance
We measure the time it takes to download and reconstitute a 10 MB file. We repeat this test 250 times, and then compare the results to AWS. Across a broad range of file sizes and locations, we are comparable to AWS. We’re especially excited about our tight distribution. Our 95th percentile time is only 0.26 seconds slower than our median time (i.e. the slowest 5 files downloaded almost as fast as the median). Again, this points to the power of decentralization and should only get better as we add more Nodes and continue to add Nodes that are distributed closer to end-users.
The above results are for downloads to a location in Toronto (i.e., eastern North America) in conjunction with a Satellite in Iowa (i.e., Central US).
Proven Capacity
Proven capacity measures the total available capacity shared by all of our Storage Node Operators. Our Storage Node Operators have offered up over 15 PB of capacity.
Using very conservative statistical models, we can state with 95% confidence that we have at least 7 PB.
Note that while we believe the capacity of the network is significantly higher, we hold ourselves to this more conservative, proven number. This number is more than an order of magnitude lower than the V2 network, which had a capacity of 150 PB at its peak. While we have several partners and Beta customers with several petabytes of capacity, we’re aiming to grow the network more gradually, so that we generally only have about three months of excess capacity at a time. This helps us ensure that all Nodes are receiving economically compelling payouts.
Number of Active Nodes
This is the number of Nodes currently connected to the network. The number of active Nodes in the table above (over 2,956 Nodes now) excludes any Nodes temporarily offline, any Nodes that have voluntarily quit the network, and any Nodes that have been disqualified (e.g. due to missing uptime or audit requirements). Our production goal is 5,000 Nodes, still a fraction of the V2 number. Once a Node is vetted, if it ceases to participate in the network, it contributes to our vetted Node churn metric (see below).
Vetted Node Churn
Our current vetted Node churn (excluding probationary Nodes) is 1.18% for the last 30 days. Our Beta 2 gate is 3%, and our production gate is 2.0%, so we’ve already hit this production-level metric. Our system is very resilient to having any individual Nodes (or even a significant percentage of Nodes) churn. However, performance, economics, and most statistics all do better as we bring average Node churn down.
Other Gates
We have a wide variety of other gates. These include gates around code quality and test coverage, user and storage Node operator setup success rates, various capabilities, payment capabilities, peak network connections, and more.
We also have gates around enablement of non-Storj Labs Tardigrade Satellites, and are aiming to be “Chaos Monkey” and even “Chaos Gorilla” resilient before production.
We hit all seven gates for Beta 2, as well as an additional two gates for production. For the past 30 days, we’ve had 99.96% availability on our service, which includes 9 deployments, all with zero downtime.
Measure What Matters
When you’re trying to measure something that’s somewhat nebulous like being “enterprise-grade,” try distilling the goals down to their core parts and measure what matters most to you, your business, and your customers. If you can do that, you can measure your progress and continually make improvements to your offerings.
We’ll continue to measure the performance of the network over the next several weeks, and if everything goes according to plan, we’ll be in production in January 2020. Stay tuned for further updates. You can also sign up to try out the network for yourself! Everyone who signs up ahead of production will receive a credit worth 1 terabyte of cloud storage and 333 of download egress on Tardigrade.
Originally published at https://storj.io.
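Going back to the File Durability section, the piece counts quoted there (80 pieces per segment, any 29 sufficient to rebuild, repair at 52) can be turned into a small Python sketch. The function names are hypothetical, treating "at or below 52" as the repair trigger is an assumption, and the loss estimate assumes pieces fail independently with the same probability, which is a simplification rather than the statistical model referred to in the post.

```python
from math import comb

TOTAL_PIECES = 80       # pieces stored per segment (from the post)
REQUIRED_PIECES = 29    # any 29 pieces rebuild the segment (from the post)
REPAIR_THRESHOLD = 52   # repair threshold quoted in the post


def expansion_factor(total: int = TOTAL_PIECES, required: int = REQUIRED_PIECES) -> float:
    """Raw storage used per byte of segment data."""
    return total / required


def segment_state(healthy: int,
                  required: int = REQUIRED_PIECES,
                  repair_threshold: int = REPAIR_THRESHOLD) -> str:
    """Classify a segment by its number of healthy pieces (assumed semantics)."""
    if healthy < required:
        return "lost"
    if healthy <= repair_threshold:
        return "needs-repair"
    return "healthy"


def loss_probability(p_piece_lost: float,
                     total: int = TOTAL_PIECES,
                     required: int = REQUIRED_PIECES) -> float:
    """P(fewer than `required` pieces survive), assuming independent piece loss."""
    p_survive = 1.0 - p_piece_lost
    return sum(
        comb(total, k) * p_survive ** k * p_piece_lost ** (total - k)
        for k in range(required)
    )


if __name__ == "__main__":
    print(round(expansion_factor(), 2))  # 2.76
    print(segment_state(50))             # needs-repair
    print(loss_probability(0.10))        # tiny when only 10% of pieces are lost
```

The gap between the repair threshold and the 29 pieces needed for reconstruction is what gives the repair process time to act before a segment is ever at risk.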
19. 11. 19
Architecting a Decentralize...
GitBackup is a tool that backs up and archives GitHub repositories. The tool is in the process of backing up the entirety of GitHub, which currently stands at an estimated 1–2 PB of data, onto the Storj network. As of today, October 18, 2019, the tool has snapshotted 815,200 repositories across more than 150,000 users.
GitHub is the largest store of open source code in the world, with 20 million users and more than 28 million public repositories as of April 2017.
We believe that this reservoir of free and open source code acts as a digital version of a public good, similar to a developers’ library: a library that empowers software engineers to access the collective knowledge around open source code, development patterns, and free software.
While GitHub is a wonderful service, it’s owned by an agenda-driven global corporation and is thus a single point of failure, prone to downtime, blockage, and censorship. For example, Microsoft’s acquisition of LinkedIn shows how user content can be gradually taken away (by means of paywalls and login walls).
Furthermore, on July 25, 2019, a developer based in Iran wrote on Medium about how GitHub blocked his private repositories and prohibited access to GitHub Pages. Soon after, GitHub confirmed that it was blocking developers in Iran, Crimea, Cuba, North Korea, and Syria from accessing private repositories.
If we want to guarantee the preservation of the work of hundreds of thousands of open source developers, we need to act now!
Let’s download it all!
We’re currently using gharchive.org to get a list of GitHub usernames that have had a public action since 2015. So far, the 815,200 repositories we’ve backed up constitute about 80 TB of data.
We anticipate that the entirety of public GitHub repos is about 1–2 PB, so we still have a way to go.
If you want to back up your codebase’s repository (or all of GitHub) to the decentralized cloud, check out the tool, found here: http://gitbackup.org/
GitBackup was built by Shawn Wilkinson in collaboration with a number of Storj Labs’ engineers and community members. The tool was demonstrated on October 11 at Devcon V (Osaka, Japan).
By Kevin Leffew on Community
Originally published at https://storj.io on October 18, 2019
19. 10. 18
IPFS Now on Storj Network
Developers have been clamoring for a decentralized storage solution for pinning data on IPFS, and the Storj community has answered the call with storjipfs.com.
The IPFS protocol is popular with decentralized app developers as a way to address content by its hash. While it’s merely a way to address files across a DHT (or network of Kademlia nodes), it’s usually deployed with Amazon S3 or local storage on the backend. What this means is that decentralized apps using IPFS without pinning to a decentralized storage backend aren’t all that decentralized.
Any time a file is uploaded to an IPFS node, there’s no guarantee the file will persist longer than a few minutes (unless self-hosted on reliable hardware, or backed by a centralized cloud provider). The users of services built on the IPFS network face issues where the data they’re trying to access and share is no longer hosted by any nodes. The reality of IPFS is best illustrated through the IPFS subreddit: many of the links are dead, because their hosts have gone offline.
We’re excited to announce the availability of a reference architecture that backs an IPFS node with the Tardigrade decentralized cloud storage service. This guarantees the persistence, distribution, and security of content-addressed data.
You can now upload files to the Storj network through the IPFS system by going to storjipfs.com.
Our community members created this impressive project and we can’t thank them enough for their efforts.
Traditional IPFS architecture requires copies of a file to be spread amongst multiple hosts to achieve redundancy. While the theory is interesting, in practice the approach just doesn’t produce the performance and availability required in modern applications. Instead of replicating files to multiple hosts and relying on a single host for file delivery, Storj uses erasure coding and peer-to-peer parallel delivery.
We don’t just think our approach is better; the math proves it.
When an IPFS node is backed by the Tardigrade Network, we are able to solve many of the problems that IPFS developers face, including data decentralization, data persistence, and default encryption at rest. The Storj network architecture has a native repair system built into it that ensures files remain alive, even when nodes go offline. This reference implementation provides IPFS addressability with durability and reliability on par with the best centralized cloud providers in the industry.
The IPFS gateway isn’t the only solution for developers on the Storj network. When it comes to reliable, performant, secure, and economical storage for decentralized apps, the native Storj platform is the best option. Storj offers a wide range of developer tools, including a CLI, an S3-compatible gateway, and a Go library with language bindings for C, Python, Node.js, .NET, Java, Android, and Swift.
To gain access to the Storj network and Tardigrade Service, sign up for the developer waitlist here: https://tardigrade.io/waitlist.
By Kevin Leffew on Business
Originally published at https://storj.io on October 6, 2019
19. 10. 07